Slide 11_distribution

(1)

Homework

• Using the BHS 98 data, estimate the following population model by regression:

[Population model equation: not recoverable from the source text]

• The results of the statistical tests should also be reported.

• You must submit your script file to me by e-mail.

• The subject line of the e-mail must follow this format:

Homework2:your student id

• Deadline: July 7, 2015

(2)

Econometrics

Difference in Difference approach

Keisuke Kawata

Hiroshima University

(3)

Reminder: Population mean approach

• Suppose our interest is the effect of T on Y

⇒ We can assume the following population model:

$Y_i = \beta_0 + \beta_T T_i + u_i$

Only if $E[u_i \mid T_i = T]$ is the same for any T, we can get the unbiased estimator ($\hat{\beta}_T$) from the OLS regression or the sub-sample mean difference.

• The above assumption often does not hold due to covariates that are correlated with both T and Y.

⇒ By introducing control variables, we can reduce the bias.

• In many cases, we cannot observe all covariates.

⇒ Due to these unobservable covariates (omitted variables), bias still remains.
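The point about observable and unobservable covariates can be illustrated with a small made-up population. This is only a sketch: the outcome equation, the coefficients, and the cell counts are assumptions chosen for illustration, with X standing for an observed covariate and A for an unobserved one.

```python
from statistics import mean

BETA_T = 2.0  # assumed true causal effect of T on Y

def outcome(t, x, a):
    # Hypothetical model: X is observed, A is unobserved, no other noise.
    return 1.0 + BETA_T * t + 3.0 * x + 4.0 * a

# (T, X, A, number of units): treated units have X = 1 and A = 1 more often.
cells = [
    (1, 1, 1, 30), (1, 1, 0, 10), (1, 0, 1, 15), (1, 0, 0, 5),
    (0, 1, 1, 5), (0, 1, 0, 15), (0, 0, 1, 10), (0, 0, 0, 30),
]
# The analyst only sees (T, X, Y); A is dropped on purpose.
rows = [(t, x, outcome(t, x, a)) for t, x, a, n in cells for _ in range(n)]

def mean_diff(data):
    # Sub-sample mean difference between treated and untreated units.
    treated = [y for t, _, y in data if t == 1]
    control = [y for t, _, y in data if t == 0]
    return mean(treated) - mean(control)

naive = mean_diff(rows)

# Control for X by comparing within each X stratum, then averaging by stratum size.
strata = [[r for r in rows if r[1] == x] for x in (0, 1)]
adjusted = sum(len(s) * mean_diff(s) for s in strata) / len(rows)

print(naive, adjusted)  # 5.0 4.0, while the true effect is 2.0
```

Controlling for X lowers the bias from 3 to 2 in this example, but the unobserved A still biases the estimate.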

(4)

(Reminder) Panel data

• If you can use panel data, the bias from omitted variables can be reduced.

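Panel Data: consists of observations on multiple entities over multiple periods.

⇔

Cross Section Data: consists of observations on multiple entities in only one period.

Time series data: consists of observations on only one entity over multiple periods.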
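As a minimal sketch of the three data types, the snippet below builds a tiny panel and slices it. The household names and expenditure values are made up for illustration.

```python
# Hypothetical two-household, two-period panel (all values are made up).
panel = [
    {"household": "A", "period": 1, "expenditure": 100.0},
    {"household": "A", "period": 2, "expenditure": 120.0},
    {"household": "B", "period": 1, "expenditure": 80.0},
    {"household": "B", "period": 2, "expenditure": 90.0},
]

# Cross-section data: several entities, only one period.
cross_section = [row for row in panel if row["period"] == 2]

# Time-series data: only one entity, several periods.
time_series = [row for row in panel if row["household"] == "A"]

print(len(panel), len(cross_section), len(time_series))  # 4 2 2
```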

(5)

Example: Effect of Microfinance on expenditure

Microfinance status ⇒ Expenditure

Observable covariates: Education, Household size, age, Village history

Unobservable covariates: Cognitive and non-cognitive ability, Family history, Village business cycle, Local weather, Political situation

(6)

Decomposition of error term

• Generally, the error term $u_{it}$ can be decomposed into the entity fixed term $\alpha_i$, the date fixed term $\lambda_t$, and the other term $\varepsilon_{it}$:

$u_{it} = \alpha_i + \lambda_t + \varepsilon_{it}$

where $i$ is the name of the entity and $t$ is the date.

e.g.) The effects of microfinance on household expenditure

$\alpha_i$: Household fixed term ⇒ Cognitive and non-cognitive abilities, race, family history, etc.

$\lambda_t$: Year fixed term ⇒ Business cycle, trade openness, and national average amount of rainfall.

(7)

E.g.) The effect of micro-finance

Two-period panel data (periods 1 & 2): In period 2, some of the households join a microfinance program.

Period 1: Not join microfinance (all households)

Period 2: Not join microfinance (control group) | Join microfinance (treatment group)

(8)

Definition of causal effect

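• Suppose the treatment is binary: $T_{it} \in \{0, 1\}$.

• Using the decomposition of the error term, the population model can be rewritten as

$Y_{it} = \beta_0 + \beta_T T_{it} + \alpha_i + \lambda_t + \varepsilon_{it}$

where $\alpha_i$ is the entity fixed term, $\lambda_t$ is the date fixed term, and $\varepsilon_{it}$ is the other term.

• The causal effect can be defined as $\beta_T$.

• The sample average in period $s$ can be written as

$E[Y_{is} \mid T_{is} = T] = \beta_0 + \beta_T T + E[\alpha_i \mid T_{is} = T] + \lambda_s + E[\varepsilon_{is} \mid T_{is} = T]$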

(9)

One difference approach: Time series data

Time Series analysis: Using time-series variation, we try to estimate the causal effect.

⇒ Compare the expenditure of households in the orange (treatment) group between periods 1 and 2.

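• The difference of sub-sample means is

$E[Y_{i2} \mid T_{i2} = 1] - E[Y_{i1} \mid T_{i2} = 1]$

$= \beta_T + \lambda_2 - \lambda_1 + E[\varepsilon_{i2} \mid T_{i2} = 1] - E[\varepsilon_{i1} \mid T_{i2} = 1]$

(here $\lambda_t$ is the date fixed term and $\varepsilon_{it}$ is the other term).

We can get the unbiased estimator ⇔

$E[\varepsilon_{i2} \mid T_{i2} = 1] = E[\varepsilon_{i1} \mid T_{i2} = 1]$ & $\lambda_2 = \lambda_1$

Note: The effect of the entity fixed term can be eliminated.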
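A minimal numerical sketch of the before-after comparison for the treatment group follows. All values (the effect, the date fixed terms, the household fixed terms, and the intercept 10.0) are assumptions chosen for illustration, and the other term is set to zero.

```python
from statistics import mean

BETA_T = 2.0               # assumed true causal effect
LAMBDA = {1: 0.0, 2: 3.0}  # assumed date fixed terms for periods 1 and 2
ALPHAS = [1.0, 4.0, 7.0]   # assumed entity fixed terms of three treated households

def expenditure(alpha, period, treated):
    # Y_it = 10 + BETA_T * T_it + alpha_i + lambda_t, with the other term set to zero.
    return 10.0 + BETA_T * treated + alpha + LAMBDA[period]

before = mean([expenditure(a, 1, 0) for a in ALPHAS])  # period 1: nobody has joined
after = mean([expenditure(a, 2, 1) for a in ALPHAS])   # period 2: all of them joined
time_series_diff = after - before

print(time_series_diff)  # 5.0 = BETA_T + (lambda_2 - lambda_1)
```

The household fixed terms cancel because the same households appear in both periods, but the change in the date fixed term (3.0 here) is counted as if it were part of the effect.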

(10)

One difference approach: Cross Section Data

Cross Section Data: Using cross-section variation, we try to estimate the causal effect.

⇒ Compare the average expenditure in period 2.

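• The difference of sub-sample means is

$E[Y_{i2} \mid T_{i2} = 1] - E[Y_{i2} \mid T_{i2} = 0]$

$= \beta_T + E[\alpha_i \mid T_{i2} = 1] - E[\alpha_i \mid T_{i2} = 0] + E[\varepsilon_{i2} \mid T_{i2} = 1] - E[\varepsilon_{i2} \mid T_{i2} = 0]$

(here $\alpha_i$ is the entity fixed term and $\varepsilon_{it}$ is the other term).

We can get the unbiased estimator ⇔

$E[\varepsilon_{i2} \mid T_{i2} = 1] = E[\varepsilon_{i2} \mid T_{i2} = 0]$ & $E[\alpha_i \mid T_{i2} = 1] = E[\alpha_i \mid T_{i2} = 0]$

Note: The effect of the date fixed term can be eliminated.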
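A short worked example of the cross-section comparison may help. The numbers are assumptions chosen for illustration: a true effect of 2, average entity fixed terms of 5 (treatment group) and 1 (control group), and equal expected other terms in the two groups.

```latex
% Assumed values: \beta_T = 2, E[\alpha_i | T_{i2}=1] = 5, E[\alpha_i | T_{i2}=0] = 1,
% and E[\varepsilon_{i2} | T_{i2}=1] = E[\varepsilon_{i2} | T_{i2}=0].
\begin{align*}
E[Y_{i2} \mid T_{i2}=1] - E[Y_{i2} \mid T_{i2}=0]
  &= \beta_T + \bigl(E[\alpha_i \mid T_{i2}=1] - E[\alpha_i \mid T_{i2}=0]\bigr) \\
  &= 2 + (5 - 1) = 6 .
\end{align*}
```

The date fixed term of period 2 appears in both group means and cancels, but the difference in entity fixed terms (4 here) is added to the true effect.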

(11)

Limitation of one difference approach

Even if the data come from pure random sampling and the expected value of the other terms does not depend on treatment, the sample difference may be biased because of entity fixed or date fixed effects.

• Cross-section analysis can eliminate the date fixed effect, but the entity fixed effect still remains.

• Time-series analysis can eliminate the entity fixed effect, but the date fixed effect still remains.

(12)

Difference-in-Difference approach: First step

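Panel Data: Using the variation among both entities and dates, we can estimate the causal effect ⇒ Difference-in-Difference approach (= Double difference)

First step. Using time-series variation, we calculate the difference of sub-sample means in each group.

Orange group (treatment group):

$E[\Delta Y_i \mid T_{i2} = 1] \equiv E[Y_{i2} \mid T_{i2} = 1] - E[Y_{i1} \mid T_{i2} = 1]$

$= \beta_T + \lambda_2 - \lambda_1 + E[\varepsilon_{i2} \mid T_{i2} = 1] - E[\varepsilon_{i1} \mid T_{i2} = 1]$

Green group (control group):

$E[\Delta Y_i \mid T_{i2} = 0] \equiv E[Y_{i2} \mid T_{i2} = 0] - E[Y_{i1} \mid T_{i2} = 0]$

$= \lambda_2 - \lambda_1 + E[\varepsilon_{i2} \mid T_{i2} = 0] - E[\varepsilon_{i1} \mid T_{i2} = 0]$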

(13)

Difference-in-Difference approach: Second step

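Second step. Using the cross-section variation, we calculate the DID estimator

$E[\Delta Y_i \mid T_{i2} = 1] - E[\Delta Y_i \mid T_{i2} = 0]$

$= \beta_T + E[\varepsilon_{i2} \mid T_{i2} = 1] - E[\varepsilon_{i1} \mid T_{i2} = 1] - E[\varepsilon_{i2} \mid T_{i2} = 0] + E[\varepsilon_{i1} \mid T_{i2} = 0]$

If $E[\Delta \varepsilon_i \mid T_{i2} = 1] = E[\Delta \varepsilon_i \mid T_{i2} = 0]$, where $\Delta \varepsilon_i \equiv \varepsilon_{i2} - \varepsilon_{i1}$,

⇒ $E[\Delta Y_i \mid T_{i2} = 1] - E[\Delta Y_i \mid T_{i2} = 0] = \beta_T$

• Using the DID, we can eliminate bias from the entity fixed and date fixed terms.

• If such terms are the only source of bias, we can obtain unbiased estimators of causal effects.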
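The two steps can be sketched on a tiny hypothetical panel. The rows below are made up: they were generated from an outcome equal to 10 plus 2 times the treatment, plus a household fixed term, plus a date fixed term of 0 in period 1 and 3 in period 2, so the assumed true effect is 2. The names `rows`, `group_mean`, and `did_estimate` are illustrative only.

```python
from statistics import mean

# Hypothetical panel: (household, group, period, expenditure).
# Treated households (a, b) have larger household fixed terms than controls (c, d).
rows = [
    ("a", "treatment", 1, 15.0), ("a", "treatment", 2, 20.0),
    ("b", "treatment", 1, 17.0), ("b", "treatment", 2, 22.0),
    ("c", "control", 1, 11.0), ("c", "control", 2, 14.0),
    ("d", "control", 1, 13.0), ("d", "control", 2, 16.0),
]

def group_mean(group, period):
    # Sub-sample mean of expenditure for one group in one period.
    return mean([y for _, g, p, y in rows if g == group and p == period])

def did_estimate():
    # First step: time difference within each group.
    change_treatment = group_mean("treatment", 2) - group_mean("treatment", 1)
    change_control = group_mean("control", 2) - group_mean("control", 1)
    # Second step: difference of the two time differences.
    return change_treatment - change_control

time_series = group_mean("treatment", 2) - group_mean("treatment", 1)
cross_section = group_mean("treatment", 2) - group_mean("control", 2)

print(time_series, cross_section, did_estimate())  # 5.0 6.0 2.0
```

In this example both one-difference estimates are biased (5 and 6), while the double difference recovers the assumed effect of 2 because only entity fixed and date fixed terms are present.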

(14)

Graphical intuition

[Figure: graphical illustration of the DID estimator]

(15)

Limitations of DID

• Using DID, we can eliminate the bias from

(1) Time-invariant effects

⇒ Cognitive/non-cognitive ability, education history, birthplace, race, gender.

(2) Time-variant but common effects

⇒ Business cycle, GDP, changes in national institutions.

• We cannot eliminate the bias from

(3) Time-variant and non-common effects ⇒ Heterogeneous time trends

e.g.) Good business opportunity: If a household has a good business plan,

• They use microfinance.

• Their expected income in period 2 is high.
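The business-plan example can be put into numbers. In this sketch the values are assumptions: a true effect of 2, a common time trend of 3, and an extra period-2 income gain of 1.5 for households with a good business plan, all of whom join microfinance.

```python
BETA_T = 2.0        # assumed true causal effect
COMMON_TREND = 3.0  # change in the date fixed term, shared by all households
EXTRA_TREND = 1.5   # assumed extra period-2 gain of households with a good business plan

def did(extra_trend_treated):
    # Expected change in each group between period 1 and period 2.
    change_treatment = BETA_T + COMMON_TREND + extra_trend_treated
    change_control = COMMON_TREND
    return change_treatment - change_control

print(did(0.0))          # 2.0: no heterogeneous trend, DID equals the true effect
print(did(EXTRA_TREND))  # 3.5: the group-specific trend is counted as part of the effect
```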

(16)

How to check heterogeneous trends?

• There are no formal methods.

⇒ One informal method is to compare the pre-treatment characteristics of the treatment and control groups.

• If the difference is not large, the DID assumption is more plausible.
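One possible way to summarize such a comparison is the standardized difference in means, sketched below. This statistic is an assumption of the sketch, not a method named in the slides, and the period-1 characteristics are made-up values.

```python
from statistics import mean, stdev

# Hypothetical period-1 (pre-treatment) household characteristics.
pre_treatment = {
    "treatment": {"education_years": [6.0, 8.0, 7.0, 9.0],
                  "household_size": [5.0, 4.0, 6.0, 5.0]},
    "control": {"education_years": [6.0, 7.0, 7.0, 8.0],
                "household_size": [5.0, 5.0, 6.0, 4.0]},
}

def standardized_difference(treated, control):
    # Difference in group means divided by the pooled standard deviation.
    pooled_sd = ((stdev(treated) ** 2 + stdev(control) ** 2) / 2) ** 0.5
    return (mean(treated) - mean(control)) / pooled_sd

balance = {
    name: standardized_difference(pre_treatment["treatment"][name],
                                  pre_treatment["control"][name])
    for name in pre_treatment["treatment"]
}

for name, value in balance.items():
    print(name, round(value, 3))
```

A value near zero means the two groups look similar on that characteristic before treatment. This check is informal and does not prove that time trends are the same.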

(17)

Data requirement

• To use the DID method, your data must have the following characteristics:

Time-series variation: Your data must include observations both before and after the treatment.

Cross-section variation: Your data must include observations of both the treatment and control groups.
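These two requirements can be checked mechanically. The function below is a minimal sketch under an assumed data layout, a list of (household, period, treated) tuples with treated equal to 0 or 1; the function name and layout are hypothetical.

```python
def has_did_variation(rows):
    """rows: list of (household, period, treated) tuples, treated in {0, 1}."""
    households = {h for h, _, _ in rows}
    periods = {p for _, p, _ in rows}
    ever_treated = {h for h, _, t in rows if t == 1}
    never_treated = households - ever_treated

    # Time-series variation: at least one period observed before treatment starts.
    first_treated_period = min((p for _, p, t in rows if t == 1), default=None)
    has_before_and_after = (
        first_treated_period is not None
        and any(p < first_treated_period for p in periods)
    )
    # Cross-section variation: both a treatment group and a control group.
    has_both_groups = bool(ever_treated) and bool(never_treated)
    return has_before_and_after and has_both_groups

panel = [("a", 1, 0), ("a", 2, 1), ("b", 1, 0), ("b", 2, 0)]
only_period_2 = [("a", 2, 1), ("b", 2, 0)]
only_treated = [("a", 1, 0), ("a", 2, 1)]

print(has_did_variation(panel), has_did_variation(only_period_2),
      has_did_variation(only_treated))  # True False False
```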

(18)

Question

True/false questions. Assume that the data are obtained by pure random sampling.

1. If there are no heterogeneous time trends, the DID estimator must be unbiased.

2. Suppose your research question is the impact of a national value-added tax on household consumption. If you can use household panel data, you should use the DID estimator to obtain unbiased estimators.

(19)

Conclusion

• Using the DID approach, we can eliminate bias from time-invariant effects and/or time-variant but common effects.

• If there exist omitted variables that have time-variant and non-common effects, the DID estimator is biased.

• To consider continuous treatments and/or controls, we should use the fixed effect estimation (⇒ In the next class).