Econometrics:
Causal effects and experiment
Keisuke Kawata
IDEC
Estimation of causal effects
• In many cases, our interest is to estimate causal effects, e.g.,)
• Can microfinance actually improve the welfare of poor households?
⇒The effect of microfinance on household expenditure and health status.
• Does joining an FTA actually make the national economy grow?
⇒ The effect of an FTA on countries' GDP.
• Can higher education actually lead to higher income?
⇒ The effect of a college degree on income.
• The estimation of causal effects is one of the most important research issues in modern empirical work!!!
Plan of talks
1. Definition of the causal effect
2. Core idea: What is the problem, and how can we overcome it?
3. Estimation of the average causal effect using observed data.
4. Estimation of the average causal effect using experimental data.
• A causal effect is the effect of an action on an outcome when the outcome is the direct result, or consequence, of that action.
e.g.)
• Touching a hot stove causes you to get burned.
• Drinking water causes you to be less thirsty.
⇒ The above definition is too casual to be useful in actual empirical work.
• In this course, we discuss a statistical approach called Rubin's potential outcome model.
Casual definition of causal effects
Treatments \(T_i\):
e.g.,) education, microfinance, FTA, class size, etc.
Potential outcomes \(Y_i(T)\):
e.g.,) Potential outcomes of Keisuke's financial assets under each education level.
Definition of potential outcome
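Treatment               Potential outcome
\(T\) = high school     \(Y_K(\text{high school})\) = […]$
\(T\) = drop out        \(Y_K(\text{drop out})\) = […]$
\(T\) = college         \(Y_K(\text{college})\) = 30,000$
\(Y_i(T') - Y_i(T)\): the causal effect of changing the treatment status from \(T\) to \(T'\).
e.g.,) Causal effect of dropping out on financial assets.
Causal effect of drop out ⇒ \(Y_K(\text{drop out}) - Y_K(\text{college})\)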
Definition of causal effect
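Treatment               Potential outcome
\(T\) = high school     \(Y_K(\text{high school})\) = […]$
\(T\) = drop out        \(Y_K(\text{drop out})\) = […]$
\(T\) = college         \(Y_K(\text{college})\) = 30,000$
\(T_i\): \(i\)'s treatment status in the real world.
• We can observe one potential outcome \(Y_i(T_i)\), while the potential outcomes \(Y_i(T)\) with \(T \neq T_i\) cannot be observed.
Note: \(Y_i(T)\) with \(T \neq T_i\) are called counterfactual outcomes.
e.g.,) Because Keisuke has a bachelor's degree, we cannot observe what his assets would be if he had dropped out of college or had not taken the college entrance examination.
⇒ To know causal effects, we need good estimates of counterfactual outcomes.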
Fundamental problem
Treatment               Potential outcome
\(T\) = high school     \(Y_K(\text{high school})\) = ?
\(T\) = drop out        \(Y_K(\text{drop out})\) = ?
\(T\) = college         \(Y_K(\text{college})\) = 30,000$
• To estimate counterfactual outcomes, we often use the outcome of a comparison individual as an estimator.
⇒ In this approach, we need to observe the outcome of individuals who receive the counterfactual treatment.
e.g.,) To know the causal effect of college dropout on Keisuke's assets, we need to observe the outcome of a college dropout.
Estimator as outcomes of comparison
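• Mr. Bill is a college dropout and his current financial assets are 80,000,000,000$.
• If we use the value of his financial assets as an estimator, the estimator of the causal effect is
\(Y_B(\text{drop out}) - Y_K(\text{college})\) = 79,999,970,000$!!!
e.g., Mr. Bill
Treatment               K's potential outcome                 B's potential outcome
\(T\) = high school     \(Y_K(\text{high school})\) = ?       \(Y_B(\text{high school})\) = ?
\(T\) = drop out        \(Y_K(\text{drop out})\) = ?          \(Y_B(\text{drop out})\) = 80,000,000,000$
\(T\) = college         \(Y_K(\text{college})\) = 30,000$     \(Y_B(\text{college})\) = ?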
• We focus on binary treatments: \(T_i = 1\) if individual \(i\) receives a treatment, and \(T_i = 0\) if she/he does not.
⇒ The causal effect of the treatment on her/him can be defined as \(Y_i(1) - Y_i(0)\).
Suppose we find another individual \(j\) who does not receive the treatment; the estimator of the causal effect on \(i\) is \(Y_i(1) - Y_j(0)\).
⇒ The above estimator can be rewritten as \(Y_i(1) - Y_j(0) = [Y_i(1) - Y_i(0)] + [Y_i(0) - Y_j(0)]\).
⇒ The estimator is equal to the true effect iff \(Y_i(0) = Y_j(0)\).
The condition to know the true causal effect
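The decomposition of the comparison estimator can be checked with a small numerical sketch. All numbers below are hypothetical and chosen only for illustration; in real data the counterfactual \(Y_i(0)\) would never be observed.

```python
# Hypothetical potential outcomes (in $); these values are assumptions,
# not data from the lecture.
y_i_treated = 50_000     # Y_i(1): observed outcome of treated individual i
y_i_untreated = 40_000   # Y_i(0): counterfactual of i, unobservable in practice
y_j_untreated = 10_000   # Y_j(0): observed outcome of non-treated individual j

true_effect = y_i_treated - y_i_untreated      # Y_i(1) - Y_i(0)
estimator = y_i_treated - y_j_untreated        # Y_i(1) - Y_j(0)
baseline_gap = y_i_untreated - y_j_untreated   # Y_i(0) - Y_j(0)

# The estimator equals the true effect plus the gap in untreated outcomes,
# so it is correct only when baseline_gap is zero.
assert estimator == true_effect + baseline_gap
print(true_effect, estimator, baseline_gap)
```

With these assumed values the estimator overstates the true effect by exactly the baseline gap between the two individuals.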
• The condition \(Y_i(0) = Y_j(0)\) means that we can know the true effect only if their outcomes are exactly the same when they do not receive the treatment.
e.g.,) To use Mr. Bill's assets as a good estimator of Keisuke's counterfactual, we must believe that Keisuke's assets would be the same as Mr. Bill's if Keisuke had dropped out of college.
⇒ However, this is very difficult to believe because their abilities are hugely different.
Interpretation
• In the real world, it is very hard to find a non-treated individual who has exactly the same outcome as the counterfactual.
⇒ In some cases, we can conduct an experiment.
1. Find or make two exactly identical subjects.
2. Treat only one subject.
3. Compare outcomes between the subjects.
e.g.,) Testing the performance of a new car engine.
1. Make two exactly identical cars except for their engines: one car's engine is the new one, while the other car's engine is the old one.
2. Compare the maximum speed, fuel consumption, and other performance measures.
(Ideal) experiment
• If you can conduct the ideal experiment, you can learn the true causal effect from data with just two observations!!!
However....
• In the social sciences, this approach cannot be applied to learn causal effects.
← It is hard to find or make two exactly identical subjects.
Limitation of experiment
• Due to the limitations of experiments, we change the research targets to
Average outcome if individuals are treated: \(E[Y_i(1)]\)
Average outcome if individuals are not treated: \(E[Y_i(0)]\)
Average causal effect: \(E[Y_i(1) - Y_i(0)]\)
• The conditional average causal effects can also be defined.
Average causal effect in the treated group: \(E[Y_i(1) - Y_i(0) \mid T_i = 1]\)
Average causal effect in the control group: \(E[Y_i(1) - Y_i(0) \mid T_i = 0]\)
Average causal effects
• We try to estimate the causal effect by using non-experimental data (called observed data).
Notation:
• \(E[y_i \mid T_i = 1]\) and \(E[y_i \mid T_i = 0]\) denote the population mean income of treated and non-treated individuals, where \(y_i = Y_i(T_i)\) is the observed outcome.
• \(\bar{y}_1\) and \(\bar{y}_0\) denote the sub-sample means of the treated and the non-treated.
⇒ If our data come from pure random sampling and include treated and non-treated samples, \(\bar{y}_1\) and \(\bar{y}_0\) are unbiased estimators of the population conditional means.
⇒ The BLUE of \(E[y_i \mid T_i = 1] - E[y_i \mid T_i = 0]\) is \(\bar{y}_1 - \bar{y}_0\).
Estimation using non-experimental data
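• \(\bar{y}_1 - \bar{y}_0\) is an unbiased estimator of the causal effect among the treated group iff
\(E[y_i \mid T_i = 1] - E[y_i \mid T_i = 0] = E[Y_i(1) \mid T_i = 1] - E[Y_i(0) \mid T_i = 1]\)
⇒ \(E[Y_i(0) \mid T_i = 0] = E[Y_i(0) \mid T_i = 1]\)
• \(\bar{y}_1 - \bar{y}_0\) is an unbiased estimator of the causal effect among the control group iff
\(E[y_i \mid T_i = 1] - E[y_i \mid T_i = 0] = E[Y_i(1) \mid T_i = 0] - E[Y_i(0) \mid T_i = 0]\)
⇒ \(E[Y_i(1) \mid T_i = 1] = E[Y_i(1) \mid T_i = 0]\)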
Conditions for conditional causal effects
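The condition for the treated group can be illustrated with a small sketch: the difference in observed group means equals the average causal effect among the treated plus the gap in untreated potential outcomes between the two groups. The five units below are hypothetical, and both potential outcomes are listed only because the data are made up.

```python
from math import isclose
from statistics import mean

# Hypothetical units: (T_i, Y_i(0), Y_i(1)). In real data only one of the
# two potential outcomes is observed for each unit.
units = [
    (1, 30, 50),
    (1, 40, 55),
    (0, 10, 35),
    (0, 20, 30),
    (0, 15, 40),
]

treated = [u for u in units if u[0] == 1]
control = [u for u in units if u[0] == 0]

# Observed outcomes: Y_i(1) for the treated, Y_i(0) for the non-treated.
naive_diff = mean(y1 for _, _, y1 in treated) - mean(y0 for _, y0, _ in control)

# Average causal effect among the treated: E[Y_i(1) - Y_i(0) | T_i = 1].
att = mean(y1 - y0 for _, y0, y1 in treated)

# Gap in untreated potential outcomes: E[Y_i(0)|T_i=1] - E[Y_i(0)|T_i=0].
baseline_gap = mean(y0 for _, y0, _ in treated) - mean(y0 for _, y0, _ in control)

assert isclose(naive_diff, att + baseline_gap)
print(naive_diff, att, baseline_gap)
```

In this made-up example the gap is not zero, so the sample difference does not recover the effect among the treated.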
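• If \(E[Y_i(0) \mid T_i = 1] = E[Y_i(0) \mid T_i = 0]\) and \(E[Y_i(1) \mid T_i = 1] = E[Y_i(1) \mid T_i = 0]\), \(\bar{y}_1 - \bar{y}_0\) is an unbiased estimator of \(E[Y_i(1) - Y_i(0)]\).
proof:
• By using the law of iterated expectations,
\(E[Y_i(1)] - E[Y_i(0)] = E_T\{E[Y_i(1) \mid T_i]\} - E_T\{E[Y_i(0) \mid T_i]\}\)
\(= E[T_i]\{E[Y_i(1) \mid T_i = 1] - E[Y_i(0) \mid T_i = 1]\} + (1 - E[T_i])\{E[Y_i(1) \mid T_i = 0] - E[Y_i(0) \mid T_i = 0]\}\)
• The sample average of \(T_i\) is an unbiased estimator of \(E[T_i]\).
⇒ Under the two conditions, each term in braces equals \(E[y_i \mid T_i = 1] - E[y_i \mid T_i = 0]\), for which \(\bar{y}_1 - \bar{y}_0\) is an unbiased estimator.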
Conditions for unconditional causal effects
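The iterated-expectations step in the proof says that the average causal effect is a weighted average of the effects in the treated and control groups, with weight equal to the share of treated units. A minimal sketch on hypothetical data, using exact fractions so the identity holds without rounding:

```python
from fractions import Fraction
from statistics import mean

# Hypothetical units: (T_i, Y_i(0), Y_i(1)); the values are assumptions.
units = [(1, 30, 50), (1, 40, 55), (0, 10, 35), (0, 20, 30), (0, 15, 40)]

effects_all = [Fraction(y1 - y0) for _, y0, y1 in units]
effects_treated = [Fraction(y1 - y0) for t, y0, y1 in units if t == 1]
effects_control = [Fraction(y1 - y0) for t, y0, y1 in units if t == 0]

share_treated = Fraction(len(effects_treated), len(units))  # plays the role of E[T_i]
ate = mean(effects_all)       # E[Y_i(1) - Y_i(0)]
att = mean(effects_treated)   # E[Y_i(1) - Y_i(0) | T_i = 1]
atc = mean(effects_control)   # E[Y_i(1) - Y_i(0) | T_i = 0]

# Law of iterated expectations over treatment status.
assert ate == share_treated * att + (1 - share_treated) * atc
print(ate, att, atc, share_treated)
```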
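Proposition: Sufficient condition for an unbiased estimator
If \(E[Y_i(T) \mid T_i = T] = E[Y_i(T) \mid T_i = T']\) and \(E[Y_i(T') \mid T_i = T] = E[Y_i(T') \mid T_i = T']\), the sample difference between the \(T_i = T\) and \(T_i = T'\) samples is an unbiased estimator of the causal effect \(E[Y_i(T) - Y_i(T')]\).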
Important proposition
• If the average value of each potential outcome is the same across groups with different treatment status, the sample difference is an unbiased estimator of the causal effect.
• If the sample size in each treatment status is large enough, we can conduct a valid statistical test of whether there is a causal effect or not.
Endogenous bias
• If treatment status is determined by the individuals themselves,
⇒
e.g.,) college drop out
You may drop out from college if
• you have a good business idea ⇒
⇒ the sample difference must be a biased estimator of the causal effect among non-droppers.
• the study in college is too hard for you⇒
⇒the sample difference must be a biased estimator of the causal effect among non-droppers
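A small simulation can illustrate how self-selection biases the sample difference. The data-generating process below is an assumption made for illustration: an unobserved "ability" raises outcomes and also makes individuals choose the treatment, while the true causal effect is fixed at 2 for everyone.

```python
import random
from statistics import mean

random.seed(0)
TRUE_EFFECT = 2.0  # hypothetical constant causal effect

treated_outcomes, control_outcomes = [], []
for _ in range(20_000):
    ability = random.gauss(0, 1)                # unobserved by the researcher
    y0 = 10 + 5 * ability + random.gauss(0, 1)  # Y_i(0)
    y1 = y0 + TRUE_EFFECT                       # Y_i(1)
    chooses_treatment = ability > 0             # treatment chosen by the individual
    if chooses_treatment:
        treated_outcomes.append(y1)             # only Y_i(1) is observed
    else:
        control_outcomes.append(y0)             # only Y_i(0) is observed

sample_difference = mean(treated_outcomes) - mean(control_outcomes)
print(round(sample_difference, 2), "vs true effect", TRUE_EFFECT)
```

Because high-ability individuals select into treatment in this sketch, the untreated potential outcomes differ between the groups and the sample difference lies far above the true effect.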
• The sample difference is not an unbiased estimator of causal effects if there exist
Covariances:
Practical guideline
Difficulty to obtain unbiased estimators
• In the real world, such covariances basically always exist. e.g.,)
• The causal effect of a hospital visit on health status among hospital visitors.
Among hospital visitors and non-visitors, the potential health status must be the same if they did not visit a hospital ⇒⇒⇒ ?
• The causal effect of college dropout on assets among non-droppers.
Among droppers and non-droppers, the potential assets must be the same if they did not drop out ⇒⇒⇒ ?
Graphical example
Hospital → (causal effect) → Health
Education → (causal effect) → Income
Randomized control experiment
• To obtain unbiased estimators, randomized controlled experiments are often used in the fields of social science, epidemiology, and psychology.
Randomization: the researcher determines the treatment status at random.
⇒ Subjects in the treated and non-treated sub-samples are randomly selected.
⇒ The sub-samples can be regarded as pure random samples from all subjects.
⇒ The expected values of these sample means must be the same as the corresponding population means.
⇒ The expected potential outcomes of the treated group must be the same as those of the non-treated group.
Graphical intuition
[Figure: Population → (sampling) → Subjects → (sampling) → Treatments / Non-treated (controls)]
e.g.,) Pill's performance
• Research interest: The effect of a new pill on the health outcome.
• Design of a randomized experiment:
1. Collect subjects.
2. Divide subjects randomly into two groups: a treatment group and a control group.
3. Subjects in the treatment group take the pill.
4. Compare the health outcomes between the two groups.
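The four steps can be sketched as a simulation. The outcome model and the true effect of 2 are assumptions for illustration only; the point is that when a coin flip, not the subject, decides who takes the pill, the difference in group means lands close to the true effect.

```python
import random
from statistics import mean

random.seed(0)
TRUE_EFFECT = 2.0  # hypothetical effect of the pill on the health outcome

treatment_group, control_group = [], []
for _ in range(20_000):                         # step 1: collect subjects
    baseline = random.gauss(0, 1)               # unobserved baseline health
    y0 = 10 + 5 * baseline + random.gauss(0, 1) # outcome without the pill
    y1 = y0 + TRUE_EFFECT                       # outcome with the pill
    takes_pill = random.random() < 0.5          # step 2: random assignment
    if takes_pill:
        treatment_group.append(y1)              # step 3: treated subjects take the pill
    else:
        control_group.append(y0)

# Step 4: compare mean health outcomes between the two groups.
sample_difference = mean(treatment_group) - mean(control_group)
print(round(sample_difference, 2), "vs true effect", TRUE_EFFECT)
```

Assignment here is independent of baseline health, so the two groups have the same expected potential outcomes and the sample difference is an unbiased estimator in this sketch.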
Question
• True/False question. Suppose the data are a pure random sample.
1. If the sample size is sufficiently large, the problem of bias is reduced.
2. Even if the endogeneity bias is serious, the difference of sample means is an unbiased estimator of the difference of population means.
3. If \(E[Y_i(0) \mid T_i = 1] < E[Y_i(0) \mid T_i = 0]\), the expected value of the sample difference \(E[y_i \mid T_i = 1] - E[y_i \mid T_i = 0]\) is lower than the average causal effect.
4. Suppose that average years of education are very different between males and females. In that case, the sample difference of mean income between males and females is a biased estimator of the causal effect of gender.
Causal effects?
Gender → (direct effects) → Income
• In the above situation, the impact of gender through changes in years of education is a part of the causal effect of gender on income. ⇒ Education is called a mediator.
Conclusion
• The sample difference is an unbiased estimator if the average potential outcomes are the same across groups with different treatment status.
• If you can use experimental data, you can get a good estimator of the causal effect using only simple statistical methods.
• If you cannot use any experimental data, you must use more advanced econometric techniques to reduce the bias problem.
• In the next class, I will introduce a research paper (Banerjee et al., forthcoming, American Economic Journal: Applied, URL http://economics.mit.edu/files/5993).