---
title: "Rubin Causal Model"
---

```{r, echo=FALSE}
library(xtable)
```


The Rubin Causal Model (RCM) is made of three distinct building blocks: a treatment allocation rule, which decides who receives the treatment; potential outcomes, which measure how each individual reacts to the treatment; and the switching equation, which relates potential outcomes to observed outcomes through the allocation rule.

# The treatment allocation rule

The first building block of the RCM is the treatment allocation rule.
Throughout this class, we focus on inferring the causal effect of a single treatment relative to a control condition.
Extensions to multi-valued treatments are generally straightforward.

In the RCM, treatment allocation is captured by the variable $D_i$.
$D_i=1$ if unit $i$ receives the treatment and $D_i=0$ if unit $i$ does not receive the treatment and thus remains in the control condition.

The treatment allocation rule is critical for several reasons.
First, because it switches the treatment on or off for each unit, it is the source of the Fundamental Problem of Causal Inference (FPCI).
Second, the specific properties of the treatment allocation rule matter for the feasibility and bias of the various econometric methods that we are going to study.

Let's take a few examples of allocation rules.
First, let's imagine a treatment that is given to individuals.
Whether each individual receives the treatment partly depends on the level of her outcome before receiving the treatment.
Let's denote this variable $Y^B_i$, with $B$ standing for "Before".
It can be the health status assessed by a professional before deciding to give a drug to a patient.
It can be the poverty level of a household used to assess its eligibility for a cash transfer program.

## The sharp cutoff rule

The sharp cutoff rule means that everyone whose outcome before the treatment lies at or below some threshold $\bar{Y}$ receives the treatment.
Everyone whose outcome before the treatment lies above $\bar{Y}$ does not receive the treatment.
Such rules are common in practice.
They are often generated by administrative rules.
One very simple way to model this rule is as follows:
$$
D_i = \uns{Y_i^B\leq\bar{Y}},
$$
where $\uns{A}$ is the indicator function, taking value $1$ when $A$ is true and $0$ otherwise.
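
To make the indicator notation concrete, here is a minimal sketch in R that applies the sharp rule to five made-up pre-treatment outcomes.
The vector `YB.toy` and the threshold `barY.toy` are hypothetical values chosen only for illustration.

```{r sharp.toy,eval=TRUE,echo=TRUE,results='markup'}
# hypothetical pre-treatment outcomes and threshold (illustrative values only)
YB.toy <- c(900, 1500, 1501, 3200, 400)
barY.toy <- 1500
# sharp rule: D = 1 when the pre-treatment outcome is at or below the threshold
D.toy <- as.numeric(YB.toy <= barY.toy)
D.toy
```

The unit sitting exactly at the threshold is treated because the inequality in the rule is weak.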

<!-- \begin{numexample} -->
Imagine for example that $Y_i^B=\exp(y_i^B)$, with $y_i^B=\mu_i+U_i^B$, $\mu_i\sim\mathcal{N}(\bar{\mu},\sigma^2_{\mu})$ and $U_i^B\sim\mathcal{N}(0,\sigma^2_{U})$.
Now, let's choose some values for these parameters so that we can generate a sample of individuals and allocate the treatment among them.
I'm going to switch to R for that.
```{r param.init,eval=TRUE,echo=TRUE,results='markup'}
param <- c(8,.5,.28,1500)
names(param) <- c("barmu","sigma2mu","sigma2U","barY")
param
```
Now, I have chosen values for the parameters in my model.
For example, $\bar{\mu}=$ `r param["barmu"]` and $\bar{Y}=$ `r param["barY"]`.
What remains to be done is to generate $Y_i^B$ and then $D_i$.
For this, I have to choose a sample size ($N=1000$) and then draw $\mu_i$ and $U_i^B$ from their normal distributions.

```{r YiBD,eval=TRUE,echo=TRUE,results='hide'}
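# set a seed so that the same random sample is drawn each time the program runs
set.seed(1234)
N <- 1000
# individual component mu_i of the log outcome
mu <- rnorm(N,param["barmu"],sqrt(param["sigma2mu"]))
# shock U_i^B before the treatment
UB <- rnorm(N,0,sqrt(param["sigma2U"]))
# log outcome and outcome in levels before the treatment
yB <- mu + UB
YB <- exp(yB)
# sharp cutoff rule: treated if the pre-treatment outcome is at or below barY
Ds <- ifelse(YB<=param["barY"],1,0)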
```
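
As a quick sanity check on the simulated data, the sample mean and variance of $y_i^B$ can be compared with their theoretical values.
The sketch below relies on the fact that the simulation draws $\mu_i$ and $U_i^B$ independently, so that $y_i^B$ has mean $\bar{\mu}$ and variance $\sigma^2_{\mu}+\sigma^2_{U}$.
The names ending in `.chk` are introduced here for the check only, and the parameter values are copied by hand from `param`.

```{r check.moments,eval=TRUE,echo=TRUE,results='markup'}
# replaying the same seed and the same draws in the same order leaves the
# random number stream exactly where the chunk above left it
set.seed(1234)
N.chk <- 1000
mu.chk <- rnorm(N.chk, 8, sqrt(0.5))
UB.chk <- rnorm(N.chk, 0, sqrt(0.28))
yB.chk <- mu.chk + UB.chk
# theoretical moments next to their sample counterparts
c(mean.theory = 8, mean.sample = mean(yB.chk))
c(var.theory = 0.5 + 0.28, var.sample = var(yB.chk))
```

With $N=1000$, the sample moments should be close to the theoretical ones without matching them exactly.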

Let's now build a histogram of the data that we have just generated.

```{r histyb,eval=TRUE,echo=TRUE,results='hide',fig.cap='Histogram of $y_B$',fig.align='center',out.width='.5\\textwidth'}
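# build a histogram of yB whose bins line up with the cutoff log(barY)
# number of bins between the smallest treated yB and the cutoff
Nsteps.1 <- 15
# bin width implied by that choice
step.1 <- (log(param["barY"])-min(yB[Ds==1]))/Nsteps.1
# number of bins of the same width needed above the cutoff (possibly fractional)
Nsteps.0 <- (-log(param["barY"])+max(yB[Ds==0]))/step.1
# break points: start at the minimum and add one extra bin so that the maximum is covered
breaks <- cumsum(c(min(yB[Ds==1]),c(rep(step.1,Nsteps.1+Nsteps.0+1))))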
hist(yB,breaks=breaks,main="")
abline(v=log(param["barY"]),col="red")
```

You can see in Figure~\ref{fig:histyb} a histogram of $y_i^B$ with the red line indicating the cutoff point: $\bar{y}=\ln(\bar{Y})=$ `r log(param["barY"])`.
All the observations to the left of the red line are treated according to the sharp rule, while all those located to the right are not.
In order to see how many observations eventually receive the treatment with this allocation rule, let's build a contingency table.

```{r table.D.sharp,eval=TRUE,echo=TRUE,results='asis',warning=FALSE,error=FALSE,message=FALSE}
table.D.sharp <- table(Ds)
xtable(table.D.sharp,caption='Treatment allocation with sharp cutoff rule',label='tab:table.D.sharp')
```

We can see in Table~\ref{tab:table.D.sharp} that there are `r table.D.sharp[which(names(table.D.sharp)==1)]` treated observations.
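
As a rough cross-check on this count, the share of treated units implied by the model can be computed directly.
The sketch below assumes, as in the simulation, that $\mu_i$ and $U_i^B$ are independent, so that $y_i^B\sim\mathcal{N}(\bar{\mu},\sigma^2_{\mu}+\sigma^2_{U})$ and $\Pr(D_i=1)=\Pr(y_i^B\leq\ln\bar{Y})=\Phi\left(\frac{\ln\bar{Y}-\bar{\mu}}{\sqrt{\sigma^2_{\mu}+\sigma^2_{U}}}\right)$.
The names `par.chk` and `p.treated` are introduced here for illustration.

```{r share.theory,eval=TRUE,echo=TRUE,results='markup'}
# parameter values copied from the example above (par.chk is a name introduced for this check)
par.chk <- c(barmu = 8, sigma2mu = 0.5, sigma2U = 0.28, barY = 1500)
# Pr(D = 1) = Pr(yB <= log(barY)) when yB is normal with variance sigma2mu + sigma2U
p.treated <- pnorm(log(par.chk[["barY"]]),
                   mean = par.chk[["barmu"]],
                   sd = sqrt(par.chk[["sigma2mu"]] + par.chk[["sigma2U"]]))
p.treated
```

With these parameter values the implied share is roughly $0.22$, so the number of treated units in a sample of $1000$ should be in that neighborhood, up to sampling noise.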
<!-- \end{numexample} -->