<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml">

<head>

<meta charset="utf-8" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />


<title>Econometrics of Program Evaluation</title>

<script src="site_libs/jquery-1.11.3/jquery.min.js"></script>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link href="site_libs/bootstrap-3.3.5/css/bootstrap.min.css" rel="stylesheet" />
<script src="site_libs/bootstrap-3.3.5/js/bootstrap.min.js"></script>
<script src="site_libs/bootstrap-3.3.5/shim/html5shiv.min.js"></script>
<script src="site_libs/bootstrap-3.3.5/shim/respond.min.js"></script>
<script src="site_libs/navigation-1.1/tabsets.js"></script>
<link href="site_libs/highlightjs-1.1/default.css" rel="stylesheet" />
<script src="site_libs/highlightjs-1.1/highlight.js"></script>

<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
  pre:not([class]) {
    background-color: white;
  }
</style>
<script type="text/javascript">
if (window.hljs && document.readyState && document.readyState === "complete") {
   window.setTimeout(function() {
      hljs.initHighlighting();
   }, 0);
}
</script>


<style type="text/css">
h1 {
  font-size: 34px;
}
h1.title {
  font-size: 38px;
}
h2 {
  font-size: 30px;
}
h3 {
  font-size: 24px;
}
h4 {
  font-size: 18px;
}
h5 {
  font-size: 16px;
}
h6 {
  font-size: 12px;
}
.table th:not([align]) {
  text-align: left;
}
</style>


</head>

<body>

<style type = "text/css">
.main-container {
  max-width: 940px;
  margin-left: auto;
  margin-right: auto;
}
code {
  color: inherit;
  background-color: rgba(0, 0, 0, 0.04);
}
img {
  max-width:100%;
  height: auto;
}
.tabbed-pane {
  padding-top: 12px;
}
button.code-folding-btn:focus {
  outline: none;
}
</style>


<style type="text/css">
/* padding for bootstrap navbar */
body {
  padding-top: 51px;
  padding-bottom: 40px;
}
/* offset scroll position for anchor links (for fixed navbar)  */
.section h1 {
  padding-top: 56px;
  margin-top: -56px;
}

.section h2 {
  padding-top: 56px;
  margin-top: -56px;
}
.section h3 {
  padding-top: 56px;
  margin-top: -56px;
}
.section h4 {
  padding-top: 56px;
  margin-top: -56px;
}
.section h5 {
  padding-top: 56px;
  margin-top: -56px;
}
.section h6 {
  padding-top: 56px;
  margin-top: -56px;
}
</style>

<script>
// manage active state of menu based on current page
$(document).ready(function () {
  // active menu anchor
  href = window.location.pathname
  href = href.substr(href.lastIndexOf('/') + 1)
  if (href === "")
    href = "index.html";
  var menuAnchor = $('a[href="' + href + '"]');

  // mark it active
  menuAnchor.parent().addClass('active');

  // if it's got a parent navbar menu mark it active as well
  menuAnchor.closest('li.dropdown').addClass('active');
});
</script>


<div class="container-fluid main-container">

<!-- tabsets -->
<script>
$(document).ready(function () {
  window.buildTabsets("TOC");
});
</script>

<!-- code folding -->


<div class="navbar navbar-default  navbar-fixed-top" role="navigation">
  <div class="container">
    <div class="navbar-header">
      <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
        <span class="icon-bar"></span>
        <span class="icon-bar"></span>
        <span class="icon-bar"></span>
      </button>
      <a class="navbar-brand" href="index.html">Sylvain Chab&eacute;-Ferret's &laquo;Pas-&agrave;-Pas&raquo;</a>
    </div>
    <div id="navbar" class="navbar-collapse collapse">
      <ul class="nav navbar-nav">
        <li>
  <a href="index.html">Home</a>
</li>
<li>
  <a href="Inference.html">Causal Inference</a>
</li>
<li>
  <a href="AEP.html">Agri-Environmental Policies</a>
</li>
<li>
  <a href="Education.html">Education</a>
</li>
<li>
  <a href="TEES.html">Teaching Economics</a>
</li>
<li>
  <a href="about.html">About Me</a>
</li>
      </ul>
      <ul class="nav navbar-nav navbar-right">

      </ul>
    </div><!--/.nav-collapse -->
  </div><!--/.container -->
</div><!--/.navbar -->

<div class="fluid-row" id="header">


<h1 class="title toc-ignore">Econometrics of Program Evaluation</h1>

</div>


<p>This course covers the basic theoretical knowledge and technical skills required to implement Econometric Methods of Causal Inference. These methods are used to test the predictions of economic theories and to measure the impacts of programs. These tools have been developed by social scientists, natural scientists and statisticians over the course of the last century. Over the last thirty years, economists have gathered most of these tools into a standard toolkit. In this class, we will study this basic set of tools. These tools have been developed and/or are heavily used in labor, education, development, health and environmental economics. They are also used by funding agencies and governments to evaluate public policies, and firms are starting to use them to evaluate product design, auction design, advertisement, etc.</p>
<p>The aim of this class is threefold:</p>
<ol style="list-style-type: decimal">
<li>Provide the (minimal) mathematical underpinning required to apply Econometric Methods of Causal Inference</li>
<li>Provide the R code in order to apply these methods</li>
<li>Make the statistical issues that these methods face extremely clear and suggest solutions. I especially focus on sampling noise and the perils of significance testing. I also describe the statistical tools required to detect and correct for publication bias: meta-analysis, p-curves, etc.</li>
</ol>
<p>The course is structured in three broad sequences:</p>
<ol style="list-style-type: decimal">
<li><a href="FPI.html">The Two Fundamental Problems of Inference</a></li>
</ol>
<ul>
<li><a href="RCM.html">Rubin Causal Model</a>: the basic language to encode causality.</li>
<li>Treatment Effects: our causal parameters of interest. We are going to focus most of the time on <span class="math inline">\(TT\)</span>, the average effect of the Treatment on the Treated.</li>
<li>The Fundamental Problem of Causal Inference (FPCI): the Treatment Effects of interest can NEVER be observed, even with a sample of infinite size (a very acute problem indeed!). What we can do instead is to use transformations of the observed data that, under certain assumptions, are equal to the Treatment Effect of interest when the sample size is infinite.</li>
<li>The Biases of Intuitive Comparisons: a consequence of the FPCI is that the intuitive comparisons that we use for causal inference (the before/after and with/without comparisons) are generally biased because of factors that determine both the outcomes of the program and who receives it. These factors are called confounding factors.</li>
<li>The Fundamental Problem of Statistical Inference (FPSI): in practice, sample sizes are finite. As a consequence, in each sample, our estimator differs from the Treatment Effect of interest. This phenomenon is called sampling noise. We will cover two useful statistical tools to help with this problem: gauging the size of the sampling noise ex-post; choosing sample size ex-ante to decrease sampling noise. I cover three ways to estimate sampling noise:
<ul>
<li>Asymptotic theory using the Central Limit Theorem (CLT)</li>
<li>The Bootstrap</li>
<li>Randomization Inference</li>
</ul>
</li>
<li>The perils of significance testing: specification search and publication bias. I suggest NEVER using statistical tests, and I explain why. I suggest gauging sampling noise instead.</li>
</ul>
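<p>The Fundamental Problem of Causal Inference and the bias of the with/without comparison can be illustrated with a small simulation. This is only a sketch under stated assumptions: the confounder called <code>ability</code>, the constant effect of 1 and every numerical value are hypothetical choices made for illustration, and Python is used here for convenience. In a simulation we know both potential outcomes of each unit, so we can compute the true TT; real data only contain the outcome matching the treatment status.</p>

```python
import random
import statistics


def simulate_units(n, seed=1):
    """Simulate potential outcomes with self-selection (illustrative numbers only)."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)                    # hypothetical confounder
        y0 = 10.0 + 2.0 * ability + rng.gauss(0.0, 1.0)  # outcome without the program
        y1 = y0 + 1.0                                    # outcome with the program
        treated = ability + rng.gauss(0.0, 1.0) < 0.0    # low ability raises take-up
        units.append((y0, y1, treated))
    return units


def true_tt(units):
    """Average effect on the treated; needs y0 of the treated, never observed in real data."""
    return statistics.fmean([y1 - y0 for y0, y1, d in units if d])


def with_without(units):
    """With/without comparison, computed from observed outcomes only."""
    treated = [y1 for y0, y1, d in units if d]
    untreated = [y0 for y0, y1, d in units if not d]
    return statistics.fmean(treated) - statistics.fmean(untreated)


units = simulate_units(20000)
tt = true_tt(units)
ww = with_without(units)
print(f"TT = {tt:.2f}, with/without = {ww:.2f}, bias = {ww - tt:.2f}")
```

<p>In this design every unit gains exactly 1 from the program, yet the with/without comparison comes out negative, because the units that select into the program have lower <code>ability</code> and therefore lower outcomes even without it.</p>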
<ol start="2" style="list-style-type: decimal">
<li>Methods of Causal Inference. In this section, we learn the three sets of methods that economists use to suppress the influence of confounding factors and estimate Treatment Effects. For each estimator, we will cover identification (how it solves the fundamental problem of causal inference absent sampling noise), estimation (how to compute an estimator with a sample) and precision (how to gauge the sensitivity of our estimate to sampling noise with independently and identically distributed (i.i.d.) observations).</li>
</ol>
<ul>
<li>Randomized Controlled Trials (RCTs) solve the problem of confounding factors by allocating the treatment at random, i.e. independently of the confounders. We will cover the four most used RCT designs: randomization by brute force, after self-selection, after eligibility and encouragement designs.</li>
<li>Natural Experiments leverage features of the implementation of the program that approximate the conditions of an RCT. We are going to cover the three most used natural experiment methods: Instrumental Variables (IV), Difference-In-Differences (DID) and Regression Discontinuity Designs (RDD).</li>
<li>Observational methods try to measure the confounders and to account separately for their effects on the outcomes. Standard observational methods that we are going to study are OLS and Matching. I am also going to dedicate some time to more recent Observational Methods based on Machine Learning (ML).</li>
</ul>
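<p>As an illustration of how randomization by brute force removes the influence of a confounder, here is a minimal sketch. The data generating process (a confounder called <code>ability</code>, a constant effect of 1, a fair coin flip) is a hypothetical choice for illustration, not a design taken from the course. Because the coin flip is independent of <code>ability</code>, the two arms should be balanced on it and the with/without comparison should be close to the true effect.</p>

```python
import random
import statistics


def run_rct(n, seed=2):
    """Simulate an RCT where treatment is a coin flip (illustrative numbers only)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)                    # hypothetical confounder
        y0 = 10.0 + 2.0 * ability + rng.gauss(0.0, 1.0)  # outcome without the program
        y1 = y0 + 1.0                                    # outcome with the program
        treated = rng.random() < 0.5                     # independent of ability
        observed = y1 if treated else y0                 # only one outcome is observed
        records.append((observed, treated, ability))
    return records


def mean_gap(records, column):
    """Difference in means of one column between treated and untreated units."""
    treated = [r[column] for r in records if r[1]]
    untreated = [r[column] for r in records if not r[1]]
    return statistics.fmean(treated) - statistics.fmean(untreated)


records = run_rct(20000)
effect_estimate = mean_gap(records, 0)  # with/without comparison of outcomes
ability_gap = mean_gap(records, 2)      # balance check on the confounder
print(f"estimated effect = {effect_estimate:.2f}, ability gap = {ability_gap:.2f}")
```

<p>The balance check on <code>ability</code> is only possible here because the simulation records the confounder; in practice, balance can only be checked on observed covariates.</p>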
<ol start="3" style="list-style-type: decimal">
<li>Additional important topics</li>
</ol>
<ul>
<li>Power analysis: before implementing a given method, we want either to choose the sample size required to reach a pre-specified level of precision or to gauge the level of precision we might reach with a pre-specified sample size.</li>
<li>How to estimate precision when observations are not i.i.d.</li>
<li>Placebo tests: tests that we implement in order to check the validity of natural experiments and of observational methods.</li>
<li>LaLonde tests: check whether observational methods and natural experiments can reproduce the results of RCTs.</li>
<li>Analysis of diffusion effects.</li>
<li>Analysis of distributive effects.</li>
<li>Meta-analysis.</li>
</ul>
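<p>The two directions of power analysis described above can be sketched with a CLT-based approximation. The sketch makes several simplifying assumptions that are mine, not the course's: two arms of equal size, i.i.d. outcomes with a known standard deviation <code>sigma</code> in both arms, and precision measured as the half-width of a confidence interval around the difference in means. The function names are hypothetical.</p>

```python
import math
from statistics import NormalDist


def half_width(n_per_arm, sigma, confidence=0.95):
    """Half-width of the CLT confidence interval for a difference in means."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Var(difference) = sigma^2 / n + sigma^2 / n with equal arms
    return z * sigma * math.sqrt(2.0 / n_per_arm)


def required_n_per_arm(target_half_width, sigma, confidence=0.95):
    """Smallest arm size whose half-width does not exceed the target."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil(2.0 * (z * sigma / target_half_width) ** 2)


# Direction 1: sample size needed for a pre-specified precision
n_needed = required_n_per_arm(target_half_width=0.1, sigma=1.0)
# Direction 2: precision reached with a pre-specified sample size
precision_at_200 = half_width(n_per_arm=200, sigma=1.0)
print(n_needed, round(precision_at_200, 3))
```

<p>Halving the target half-width multiplies the required sample size by four, since precision improves with the square root of the sample size.</p>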
<p>I use <span class="math inline">\(X_i\)</span> to denote random variable <span class="math inline">\(X\)</span> throughout the class. I assume that we have access to a sample of <span class="math inline">\(N\)</span> observations indexed by <span class="math inline">\(i\in\left\{1,\dots,N\right\}\)</span>. “<span class="math inline">\(i\)</span>” will denote the basic sampling units when we are in a sample, and a basic element of the probability space when we are in populations. Introducing rigorous measure-theoretic notations for the population is feasible but is not necessary for comprehension.</p>
<p>When the sample size is infinite, we say that we have a population. A population is a very useful fiction for two reasons. First, in a population, there is no sampling noise: we observe an infinite number of observations, and our estimators are infinitely precise. This is useful to study phenomena independently of sampling noise. For example, it is in general easier to prove that an estimator is equal to <span class="math inline">\(TT\)</span> under some conditions in the population. Second, we are most of the time much more interested in estimating the values of parameters in the population rather than in the sample. The population parameter, independent of sampling noise, gives a much better idea of the causal parameter for the population of interest than the parameter in the sample. In general, the estimator for both quantities will be the same, but the estimators for the effect of sampling noise on these estimators will differ. Sampling noise for the population parameter will generally be larger, since it is affected by another source of variability (sample choice).</p>
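<p>To make sampling noise concrete, the sketch below draws one finite sample from a hypothetical randomized experiment and gauges the noise around the with/without estimator in two ways: a CLT-based standard error and a bootstrap standard error. The data generating process (true effect of 1, outcome standard deviation of 2, 500 units) and the function names are assumptions made for illustration. Randomization inference is not shown.</p>

```python
import math
import random
import statistics


def draw_sample(n, rng):
    """Draw (outcome, treated) pairs from a randomized design (illustrative numbers)."""
    sample = []
    for _ in range(n):
        treated = rng.random() < 0.5
        y0 = rng.gauss(10.0, 2.0)
        sample.append((y0 + 1.0 if treated else y0, treated))
    return sample


def with_without(sample):
    """Difference in mean outcomes between treated and untreated units."""
    t = [y for y, d in sample if d]
    c = [y for y, d in sample if not d]
    return statistics.fmean(t) - statistics.fmean(c)


def clt_se(sample):
    """Standard error of the difference in means from asymptotic theory."""
    t = [y for y, d in sample if d]
    c = [y for y, d in sample if not d]
    return math.sqrt(statistics.variance(t) / len(t) + statistics.variance(c) / len(c))


def bootstrap_se(sample, rng, reps=500):
    """Standard deviation of the estimator across resamples drawn with replacement."""
    draws = [with_without(rng.choices(sample, k=len(sample))) for _ in range(reps)]
    return statistics.stdev(draws)


rng = random.Random(3)
sample = draw_sample(500, rng)
estimate = with_without(sample)
se_clt = clt_se(sample)
se_boot = bootstrap_se(sample, rng)
print(f"estimate = {estimate:.2f}, CLT s.e. = {se_clt:.3f}, bootstrap s.e. = {se_boot:.3f}")
```

<p>With i.i.d. observations the two standard errors should be close to each other. Both shrink as the sample grows, which is the sense in which sampling noise vanishes in the population.</p>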


</div>

<script>

// add bootstrap table styles to pandoc tables
function bootstrapStylePandocTables() {
  $('tr.header').parent('thead').parent('table').addClass('table table-condensed');
}
$(document).ready(function () {
  bootstrapStylePandocTables();
});


</script>

<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>

</body>
</html>