Homework
• Using BHS 98 data, you should regress the following population model
� � = �� ∗ �� + � ∗ ℎ � + � ∗ � ℎ �
• The results of the statistical tests should also be reported.
• You must submit your script file to me by e-mail.
• The e-mail subject must follow this format:
Homework2:your student id
• Deadline: July 7, 2015
Econometrics
Difference in Difference approach
Keisuke Kawata
Hiroshima University
Reminder: Population mean approach
• Suppose our interest is the effect of T on Y
⇒ We can assume the following population model:
$$Y_i = \beta_0 + \beta_T T_i + u_i$$
Only if $E[u_i \mid T_i] = 0$ for any T can we get the unbiased estimator ($\hat{\beta}_T$) from the OLS regression or the sub-sample mean difference.
• The above assumption often does not hold due to covariates that are correlated with both T and Y.
⇒ By introducing control variables, we can reduce the bias.
• In many cases, we cannot observe all covariates.
⇒ Due to these unobservable covariates (omitted variables), bias still remains.
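The claim that, for a binary treatment, the OLS regression and the sub-sample mean difference give the same estimate can be checked on a few made-up numbers. The sketch below is only an illustration: the data and the function names `mean_difference` and `ols_slope` are hypothetical and do not come from the course material.

```python
from statistics import mean

def mean_difference(y, t):
    """Sub-sample mean difference: mean of Y for T=1 minus mean of Y for T=0."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    return mean(treated) - mean(control)

def ols_slope(y, t):
    """Slope of a simple OLS regression of Y on T (with an intercept)."""
    t_bar, y_bar = mean(t), mean(y)
    cov = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    var = sum((ti - t_bar) ** 2 for ti in t)
    return cov / var

# Made-up outcomes and binary treatment status (assumption for illustration)
y = [10.0, 12.0, 11.0, 15.0, 17.0]
t = [0, 0, 0, 1, 1]

print("mean difference:", mean_difference(y, t))
print("OLS slope      :", ols_slope(y, t))
```

Both numbers coincide up to rounding. Whether they are unbiased for the causal effect still depends on the assumption on the error term stated above.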
(Reminder) Panel data
• If you can use panel data, the bias from omitted variables can be reduced.
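Panel data: consist of observations on multiple entities over multiple time periods.
⇔
Cross-section data: consist of observations on multiple entities at only one point in time.
Time-series data: consist of observations on only one entity over multiple time periods.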
Example: Effect of Microfinance on expenditure
Microfinance status ⇒ Expenditure
Observable covariates: Education, Household size, age, Village history
Unobservable covariates:
Cognitive and Non-cognitive ability, Family history
Village business cycle, Local weather, Political situation
Decomposition of error term
• Generally, the error term $u_{it}$ can be decomposed into the entity fixed term $f_i$, the date fixed term $d_t$, and the other term $e_{it}$:
$$u_{it} = f_i + d_t + e_{it}$$
where $i$ is the name of the entity and $t$ is the date.
e.g.) The effects of micro-finance on household expenditure
$f_i$: Household fixed term ⇒ Cognitive and non-cognitive abilities, race, family history, etc.
$d_t$: Year fixed term ⇒ Business cycle, trade openness, and national average amount of rainfall.
E.g.) The effect of micro-finance
Two-period panel data (periods 1 & 2): In period 2, a part of the households join a micro-finance program.

Group                      Period 1                  Period 2
Control group (green)      Not join micro-finance    Not join micro-finance
Treatment group (orange)   Not join micro-finance    Join micro-finance
Definition of causal effect
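• Suppose the treatment is binary: $T_{it} \in \{0, 1\}$
• Using the decomposition of the error term, the population model can be rewritten as
$$Y_{it} = \beta_0 + \beta_T T_{it} + f_i + d_t + e_{it}$$
where $f_i$ is the entity fixed term, $d_t$ is the date fixed term, and $e_{it}$ is the other term.
• The causal effect can be defined as $\beta_T$.
• The sample average in period $s$ for the group with $T_{i2} = g$ can be written as
$$E[Y_{is} \mid T_{i2} = g] = \beta_0 + \beta_T E[T_{is} \mid T_{i2} = g] + E[f_i \mid T_{i2} = g] + d_s + E[e_{is} \mid T_{i2} = g]$$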
One difference approach: Time series data
Time-series analysis: Using time-series variation, we try to estimate the causal effect.
⇒ Compare the households' expenditure between periods 1 and 2 in the orange group.
• The difference of sub-sample means is
$$E[Y_{i2} \mid T_{i2} = 1] - E[Y_{i1} \mid T_{i2} = 1] = \beta_T + d_2 - d_1 + E[e_{i2} \mid T_{i2} = 1] - E[e_{i1} \mid T_{i2} = 1]$$
We can get the unbiased estimator ⇔
$E[e_{i2} \mid T_{i2} = 1] = E[e_{i1} \mid T_{i2} = 1]$ & $d_2 = d_1$
Note: The effect of the entity fixed term can be eliminated.
One difference approach: Cross Section Data
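Cross-section analysis: Using cross-section variation, we try to estimate the causal effect.
⇒ Compare the average expenditure of the orange and green groups in period 2.
• The difference of sub-sample means is
$$E[Y_{i2} \mid T_{i2} = 1] - E[Y_{i2} \mid T_{i2} = 0] = \beta_T + E[f_i \mid T_{i2} = 1] - E[f_i \mid T_{i2} = 0] + E[e_{i2} \mid T_{i2} = 1] - E[e_{i2} \mid T_{i2} = 0]$$
We can get the unbiased estimator ⇔
$E[e_{i2} \mid T_{i2} = 1] = E[e_{i2} \mid T_{i2} = 0]$ & $E[f_i \mid T_{i2} = 1] = E[f_i \mid T_{i2} = 0]$
Note: The effect of the date fixed term can be eliminated.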
Limitation of one difference approach
Even if the data come from pure random sampling and the expected value of the other term does not depend on the treatment, the sample difference may be biased because of entity fixed or date fixed effects.
• Cross-section analysis can eliminate the date fixed effect, but the entity fixed effect still remains.
• Time-series analysis can eliminate the entity fixed effect, but the date fixed effect still remains.
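The two limitations can be seen in a noise-free numerical sketch. All numbers below (a true effect of 2.0, a period-2 date effect of 1.5, an entity fixed term of 3.0 for the treatment group) are hypothetical values chosen for illustration, and `outcome` is a made-up helper, not something defined in the lecture.

```python
# Hypothetical parameters (assumptions for illustration only)
TRUE_EFFECT = 2.0                                   # causal effect of the treatment
DATE_EFFECT = {1: 0.0, 2: 1.5}                      # date fixed term per period
ENTITY_EFFECT = {"treated": 3.0, "control": 0.0}    # entity fixed term per group

def outcome(group, period):
    """Expected outcome for a group in a period; treatment starts in period 2."""
    treated_now = 1 if (group == "treated" and period == 2) else 0
    return 10.0 + TRUE_EFFECT * treated_now + ENTITY_EFFECT[group] + DATE_EFFECT[period]

# Time-series difference: treatment group, period 2 minus period 1.
# The entity fixed term cancels, but the date fixed term does not.
time_series_diff = outcome("treated", 2) - outcome("treated", 1)

# Cross-section difference: period 2, treatment group minus control group.
# The date fixed term cancels, but the entity fixed term does not.
cross_section_diff = outcome("treated", 2) - outcome("control", 2)

print("true effect            :", TRUE_EFFECT)
print("time-series difference :", time_series_diff)
print("cross-section difference:", cross_section_diff)
```

Under these assumed numbers the time-series difference is off by the change in the date effect and the cross-section difference is off by the gap in entity fixed terms.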
Difference-in-Difference approach: First step
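Panel data: Using the variation among both entities and dates, we can estimate the causal effect ⇒ Difference-in-Difference approach (= double difference)
First step. Using time-series variation, we calculate the difference of sub-sample means in each group.
Orange group (treatment group):
$$E[\Delta Y_i \mid T_{i2} = 1] \equiv E[Y_{i2} \mid T_{i2} = 1] - E[Y_{i1} \mid T_{i2} = 1] = \beta_T + d_2 - d_1 + E[e_{i2} \mid T_{i2} = 1] - E[e_{i1} \mid T_{i2} = 1]$$
Green group (control group):
$$E[\Delta Y_i \mid T_{i2} = 0] \equiv E[Y_{i2} \mid T_{i2} = 0] - E[Y_{i1} \mid T_{i2} = 0] = d_2 - d_1 + E[e_{i2} \mid T_{i2} = 0] - E[e_{i1} \mid T_{i2} = 0]$$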
Difference-in-Difference approach: Second step
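Second step. Using the cross-section variation, we calculate the DID estimator
$$E[\Delta Y_i \mid T_{i2} = 1] - E[\Delta Y_i \mid T_{i2} = 0] = \beta_T + E[e_{i2} \mid T_{i2} = 1] - E[e_{i1} \mid T_{i2} = 1] - E[e_{i2} \mid T_{i2} = 0] + E[e_{i1} \mid T_{i2} = 0]$$
If $E[e_{i2} - e_{i1} \mid T_{i2} = 1] = E[e_{i2} - e_{i1} \mid T_{i2} = 0]$
⇒ $E[\Delta Y_i \mid T_{i2} = 1] - E[\Delta Y_i \mid T_{i2} = 0] = \beta_T$
• Using the DID, we can eliminate bias from the entity fixed and date fixed terms.
• If such terms are the only source of bias, we can obtain an unbiased estimator of the causal effect.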
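The two steps can be sketched on simulated two-period panel data. This is a minimal sketch under stated assumptions: the data-generating process (households select into treatment on an unobserved "ability" term, true effect 2.0, common period-2 shift 1.5) is invented for illustration, and `did_estimate` and `cross_section_estimate` are hypothetical helper names.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical parameters (assumptions for illustration only)
TRUE_EFFECT = 2.0
DATE_EFFECT = {1: 0.0, 2: 1.5}
N = 20000

rows = []  # one tuple per household: (treated in period 2, Y in period 1, Y in period 2)
for _ in range(N):
    ability = random.gauss(0.0, 1.0)  # unobserved entity fixed term
    # Selection: households with higher ability are more likely to join
    treated = 1 if ability + random.gauss(0.0, 1.0) > 0 else 0
    y1 = 10.0 + ability + DATE_EFFECT[1] + random.gauss(0.0, 1.0)
    y2 = 10.0 + TRUE_EFFECT * treated + ability + DATE_EFFECT[2] + random.gauss(0.0, 1.0)
    rows.append((treated, y1, y2))

def did_estimate(rows):
    """First step: within-group change over time. Second step: difference across groups."""
    delta_treated = mean(y2 - y1 for g, y1, y2 in rows if g == 1)
    delta_control = mean(y2 - y1 for g, y1, y2 in rows if g == 0)
    return delta_treated - delta_control

def cross_section_estimate(rows):
    """Period-2 mean difference between groups (one-difference approach)."""
    return (mean(y2 for g, _, y2 in rows if g == 1)
            - mean(y2 for g, _, y2 in rows if g == 0))

print("true effect           :", TRUE_EFFECT)
print("cross-section estimate:", round(cross_section_estimate(rows), 3))
print("DID estimate          :", round(did_estimate(rows), 3))
```

In this setup the cross-section estimate is pushed upward by the selection on ability, while the DID estimate should land close to the true effect because the ability term and the common period-2 shift both difference out.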
Graphical intuition
[Figure: graphical illustration of DID]
Limitations of DID
• Using DID, we can eliminate the bias from
(1) Time-invariant effects
⇒ Cognitive/non-cognitive ability, education history, birthplace, race, gender.
(2) Time-variant but common effects
⇒ Business cycle, GDP, changes in national institutions.
• We cannot eliminate the bias from
(3) Time-variant and non-common effects
⇒ e.g.) Good business opportunity: If a household has a good business plan,
• They use microfinance.
• Their expected income in period 2 is high.
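The business-opportunity example can be turned into a noise-free numerical sketch. The parameter values and the names `expected_outcome`, `did`, and `trend_gap` are assumptions made for illustration; `trend_gap` stands for the extra period-2 income that only households with a good business plan (here, the treatment group) receive.

```python
TRUE_EFFECT = 2.0  # hypothetical causal effect of microfinance

def expected_outcome(treated_group, period, trend_gap):
    """Expected outcome with entity, date, treatment, and non-common time-variant terms."""
    entity_term = 3.0 if treated_group else 0.0        # time-invariant effect
    date_term = 1.5 if period == 2 else 0.0            # time-variant but common effect
    in_effect = treated_group and period == 2
    treatment = TRUE_EFFECT if in_effect else 0.0
    opportunity = trend_gap if in_effect else 0.0      # time-variant and non-common effect
    return 10.0 + entity_term + date_term + treatment + opportunity

def did(trend_gap):
    """DID computed from expected outcomes of the two groups in the two periods."""
    delta_treated = expected_outcome(True, 2, trend_gap) - expected_outcome(True, 1, trend_gap)
    delta_control = expected_outcome(False, 2, trend_gap) - expected_outcome(False, 1, trend_gap)
    return delta_treated - delta_control

for gap in (0.0, 0.5, 1.0):
    print("trend gap", gap, "-> DID", did(gap), "(true effect", TRUE_EFFECT, ")")
```

With a zero gap the DID recovers the assumed true effect. Any nonzero gap is added one-for-one to the DID, which is the bias from heterogeneous trends.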
How to check heterogeneous trends?
• There are no formal methods.
⇒ One informal method is to check the difference in pre-treatment characteristics between the treatment and control groups.
• If the difference is not so large, the DID assumption is strongly justified.
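One possible way to carry out this informal check is a small balance table of pre-treatment characteristics. The sketch below uses made-up household records; the variable names (`education`, `hh_size`, `exp_period1`) and the function `balance_table` are hypothetical. Reporting a standardized difference (the mean difference divided by a pooled standard deviation) is a design choice of this sketch, not something prescribed in the lecture.

```python
from statistics import mean, stdev

# Made-up pre-treatment records (assumption for illustration only)
households = [
    {"treated": 1, "education": 8,  "hh_size": 4, "exp_period1": 100.0},
    {"treated": 1, "education": 10, "hh_size": 5, "exp_period1": 120.0},
    {"treated": 1, "education": 12, "hh_size": 6, "exp_period1": 140.0},
    {"treated": 0, "education": 7,  "hh_size": 4, "exp_period1": 90.0},
    {"treated": 0, "education": 9,  "hh_size": 5, "exp_period1": 110.0},
    {"treated": 0, "education": 11, "hh_size": 6, "exp_period1": 130.0},
]

def balance_table(rows, variables):
    """For each variable, return (mean difference, standardized difference)."""
    table = {}
    for var in variables:
        treated = [r[var] for r in rows if r["treated"] == 1]
        control = [r[var] for r in rows if r["treated"] == 0]
        diff = mean(treated) - mean(control)
        pooled_sd = ((stdev(treated) ** 2 + stdev(control) ** 2) / 2) ** 0.5
        std_diff = diff / pooled_sd if pooled_sd > 0 else float("nan")
        table[var] = (diff, std_diff)
    return table

table = balance_table(households, ["education", "hh_size", "exp_period1"])
for var, (diff, std_diff) in table.items():
    print(f"{var:12s} mean diff = {diff:8.2f}   standardized diff = {std_diff:5.2f}")
```

How small a difference counts as "not so large" remains a judgment call, which is consistent with the point above that there is no formal method.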
Data requirement
• To use the DID method, your data must have the following characteristics:
Time-series variation: Your data must include observations from both before and after the treatment.
Cross-section variation: Your data must include observations of both treated and untreated entities.
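Before running a DID, it may help to verify both requirements on the data programmatically. The following is a rough sketch that assumes a two-period panel stored as `(entity_id, period, treated)` tuples; that layout and the function `check_did_requirements` are hypothetical.

```python
def check_did_requirements(rows, pre_period=1, post_period=2):
    """Check the two data requirements for a two-period DID.

    rows: iterable of (entity_id, period, treated) with treated in {0, 1}.
    """
    pre_ids = {e for e, p, _ in rows if p == pre_period}
    post_ids = {e for e, p, _ in rows if p == post_period}
    post_status = {t for _, p, t in rows if p == post_period}
    return {
        # The same entities are observed before and after the treatment
        "time_series_variation": bool(pre_ids) and pre_ids == post_ids,
        # Both treated and untreated entities exist in the post period
        "cross_section_variation": post_status == {0, 1},
    }

# Made-up panels (assumptions for illustration only)
good_panel = [(1, 1, 0), (1, 2, 1), (2, 1, 0), (2, 2, 0)]
only_post = [(1, 2, 1), (2, 2, 0)]                           # no pre-treatment observations
all_treated = [(1, 1, 0), (1, 2, 1), (2, 1, 0), (2, 2, 1)]   # no control group

print(check_did_requirements(good_panel))
print(check_did_requirements(only_post))
print(check_did_requirements(all_treated))
```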
Question
True/false question. Assume that the data come from pure random sampling.
1. If there are no heterogeneous time trends, the DID estimator must be unbiased.
2. Suppose your research question is the impact of a national value-added tax on household consumption. If you can use household panel data, you should use the DID estimator to obtain an unbiased estimator.
Conclusion
• Using the DID approach, we can eliminate bias from time-invariant effects and/or time-variant but common effects.
• If there exist omitted variables that have time-variant and non-common effects, the DID estimator is biased.
• To consider continuous treatments and/or controls, we should use fixed-effects estimation (⇒ in the next class).