Econometrics
Regression with a binary outcome
Keisuke Kawata IDEC
Binary explained variables
• In some cases, we would like to estimate the effect on a binary outcome, e.g.)
• The effects of class size on drop out.
• The effects of parent income on college graduation.
• The effects of the number of children on mother’s labor supply.
• The effects of business cycle in home country on migration.
⇒ How can we estimate such effects?
Binary treatments
• Let us denote the (binary) potential outcome by $Y_i(T)$ (= 0 or 1).
• If the outcome is a binary variable, $E[Y \mid T] = \Pr(Y = 1 \mid T)$.
• A good estimator of the average difference in the probability that $Y = 1$ between the $T = 1$ and $T = 0$ groups is the difference in sample means,
$$\hat{\beta} = \bar{Y}_{T=1} - \bar{Y}_{T=0}.$$
• If random assignment holds, this sample difference is the BLUE of the effect of the treatment on the probability that $Y = 1$.
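As an illustration of this difference-in-means estimator, here is a minimal sketch in Python with made-up data (both the language and all numbers are my choices for illustration; the course's R session uses other tools):

```python
# Difference in sample means of a binary outcome between the T=1 and
# T=0 groups. All data values are hypothetical.
import numpy as np

T = np.array([1, 1, 1, 0, 0, 0, 1, 0])   # treatment indicator
Y = np.array([1, 0, 1, 0, 1, 0, 1, 0])   # binary outcome

# Sample analogue of E[Y|T=1] - E[Y|T=0]: since Y is binary, each
# group mean is the share of Y=1, i.e. an estimate of Pr(Y=1|T).
beta_hat = Y[T == 1].mean() - Y[T == 0].mean()
print(beta_hat)  # 0.75 - 0.25 = 0.5
```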
Continuous treatments: Linear model
• With a continuous treatment, we need a population model for the conditional probability.
⇒ What type of model of the conditional probability should be applied?
• Linear probability model:
$$\Pr(Y = 1 \mid T = t) = \beta_0 + \beta_1 t$$
⇒ We can use the standard OLS technique to get the estimator $\hat{\beta}_1$, which can be interpreted as the effect of the treatment on the probability that $Y = 1$.
⇒ Under the least squares assumptions, these estimators are unbiased and consistent.
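A minimal sketch of estimating a linear probability model by OLS, using Python and simulated data (both are illustrative assumptions, not the course's setup):

```python
# Linear probability model by OLS on simulated data: the true model is
# Pr(Y=1|T=t) = 0.2 + 0.5*t, and OLS should recover these coefficients.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
T = rng.uniform(0, 1, n)                 # continuous treatment
p = 0.2 + 0.5 * T                        # true success probability
Y = (rng.uniform(size=n) < p).astype(float)   # binary outcome

X = np.column_stack([np.ones(n), T])     # design matrix with intercept
b0_hat, b1_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
print(b0_hat, b1_hat)                    # close to 0.2 and 0.5
```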
Problem of linear model: Graphical example
[Figure: the fitted line $\Pr(Y = 1 \mid T = t) = \beta_0 + \beta_1 t$ plotted against $T$; for small or large values of $T$, the predicted "probability" falls below 0 or above 1.]
Non-linear population model
• We assume the population model as
$$\Pr(Y = 1 \mid T = t) = \Phi(\beta_0 + \beta_1 t),$$
where $\Phi$ is a cumulative distribution function (c.d.f.).
• Probit model: $\Phi$ is the c.d.f. of the standard normal distribution,
$$\Phi(\beta_0 + \beta_1 t) = \int_{-\infty}^{\beta_0 + \beta_1 t} \frac{1}{\sqrt{2\pi}} e^{-s^2/2}\, ds.$$
• Logit model: $\Phi$ is the c.d.f. of the logistic distribution,
$$\Phi(\beta_0 + \beta_1 t) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 t)}}.$$
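The two link functions can be compared numerically. A small Python sketch (the language choice is mine) using only the standard library:

```python
# Probit link: the standard normal c.d.f.; logit link: the logistic
# c.d.f. 1/(1+exp(-z)). Both map any real index z into (0, 1).
import math
from statistics import NormalDist

def probit(z):
    return NormalDist().cdf(z)          # Phi(z), standard normal c.d.f.

def logit(z):
    return 1 / (1 + math.exp(-z))       # logistic c.d.f.

for z in (-2, 0, 2):
    print(z, round(probit(z), 3), round(logit(z), 3))
# Both links give exactly 0.5 at z = 0; the logit has fatter tails.
```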
Graphical (rough) image
[Figure: the S-shaped curve $\Phi(\beta_0 + \beta_1 T)$ plotted against $T$, bounded between 0 and 1.]
The shape of the logit model is similar to that of the probit model.
Non-linear population model
• How to estimate the coefficients of probit and logit models?
← Because the population model is not additively separable, we cannot apply standard OLS estimation.
Approach 1) Nonlinear least squares estimation:
The estimators are chosen to minimize the sum of squared prediction mistakes,
$$\min_{b_0, b_1} \sum_i \left( Y_i - \Phi(b_0 + b_1 T_i) \right)^2.$$
⇒ Consistent, but less efficient than maximum likelihood estimation.
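The nonlinear least squares idea can be sketched as follows, on simulated probit data, with a deliberately crude grid search standing in for a real optimizer (Python, the data, and the grid are all illustrative assumptions):

```python
# Nonlinear least squares for a probit model: pick (b0, b1) minimizing
# the sum of squared prediction mistakes sum_i (Y_i - Phi(b0+b1*T_i))^2.
# All numbers are illustrative; a grid search replaces a real optimizer.
import numpy as np
from statistics import NormalDist

Phi = NormalDist().cdf                  # standard normal c.d.f.

rng = np.random.default_rng(1)
n = 2000
T = rng.normal(size=n)
p_true = np.array([Phi(z) for z in -0.4 + 1.0 * T])
Y = (rng.uniform(size=n) < p_true).astype(float)

def ssr(b0, b1):
    # sum of squared prediction mistakes for candidate (b0, b1)
    fitted = np.array([Phi(z) for z in b0 + b1 * T])
    return float(((Y - fitted) ** 2).sum())

grid = np.linspace(-2, 2, 21)           # step 0.2; true values lie on it
b0_hat, b1_hat = min(((a, b) for a in grid for b in grid),
                     key=lambda ab: ssr(*ab))
print(b0_hat, b1_hat)                   # near the true -0.4 and 1.0
```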
Maximum Likelihood Estimation: example
Maximum Likelihood Estimation: the estimators are determined to maximize the likelihood of drawing the observed data.
e.g.) Suppose the population distribution is binary:
$Y = 1$ with probability $p$,
$Y = 0$ with probability $1 - p$.
Our data set is:
ID  The value of y
1   1
2   0
3   1
• Let us try to estimate $p$ using maximum likelihood estimation.
Maximum Likelihood Estimation: example
If we use purely random sampling, the probability that our data set is drawn is
$$L(p) = p \times (1 - p) \times p.$$
⇒ This is called the likelihood function.
Maximum likelihood estimators are defined to maximize the likelihood function:
$$\max_p \; p \times (1 - p) \times p.$$
Then, $\hat{p} = 2/3$, which equals the sample mean of $y$.
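The maximization can be checked numerically. A tiny Python sketch (the language is my choice) confirming that $L(p) = p \times (1-p) \times p$ peaks near $p = 2/3$:

```python
# Maximize the likelihood L(p) = p * (1-p) * p for the observations
# (1, 0, 1) by a simple grid search over p in (0, 1).
likelihood = lambda p: p * (1 - p) * p

grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=likelihood)
print(p_hat)  # 0.667, i.e. approximately 2/3 = the sample mean
```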
Maximum Likelihood Estimation
Procedure of maximum likelihood estimation
1. Under the assumption about the distribution (probit: normal, logit: logistic), calculate the likelihood as a function of the unknown parameters $\beta_0, \dots, \beta_k$.
2. Estimate the unknown parameters by maximizing the likelihood function.
Under the assumption about the distribution, maximum likelihood estimators are consistent and have an asymptotically normal distribution in large samples
⇒ more efficient than nonlinear least squares estimation.
⇒ We can do statistical tests and construct confidence intervals.
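The two-step procedure can be sketched for a logit model as follows, with simulated data and plain gradient ascent standing in for a packaged optimizer (Python, the data, and the optimizer are all illustrative assumptions):

```python
# MLE for a logit model: (1) the Bernoulli log-likelihood is a function
# of (b0, b1); (2) maximize it. The logit log-likelihood is concave, so
# simple gradient ascent converges. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(2)
n = 3000
T = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-(0.3 + 0.8 * T)))
Y = (rng.uniform(size=n) < p_true).astype(float)

X = np.column_stack([np.ones(n), T])    # intercept and treatment
b = np.zeros(2)                         # start (b0, b1) at zero
for _ in range(2000):
    p = 1 / (1 + np.exp(-X @ b))        # Pr(Y=1|T) at current parameters
    b += (X.T @ (Y - p)) / n            # score of the average log-likelihood
print(b)                                # near the true (0.3, 0.8)
```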
Marginal effects
• The coefficients of probit and logit do not equal marginal effects:
$$\beta_1 \neq \frac{\partial E[Y \mid T]}{\partial T}.$$
⇒ The magnitude of $\beta_1$ is not directly interpretable.
• Using statistical software, you can calculate marginal effects ← these should be reported in your paper.
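For the logit model the marginal effect has the closed form $\beta_1 \Phi'(\beta_0 + \beta_1 t) = \beta_1 p(1-p)$, which varies with $t$. A small Python sketch with assumed (not estimated) coefficients:

```python
# The logit coefficient b1 is not the marginal effect: the effect of T
# on Pr(Y=1) is b1 * p * (1-p), which changes along T. The coefficients
# and evaluation points below are assumed for illustration.
import numpy as np

b0, b1 = 0.3, 0.8                       # assumed logit coefficients
T = np.linspace(-3, 3, 601)             # evaluation points

p = 1 / (1 + np.exp(-(b0 + b1 * T)))    # Pr(Y=1|T) at each point
marginal = b1 * p * (1 - p)             # d Pr(Y=1|T) / dT at each point

print(round(marginal.max(), 3))         # 0.2: largest where p = 0.5
print(round(marginal.mean(), 3))        # a simple average marginal effect
```

Statistical software reports such averages directly; the point is that $\beta_1 = 0.8$ itself never equals the probability change.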
Conclusion: Linear VS Probit VS Logit
• Which type of specification is better? ⇒ No clear answer.
• You should first obtain estimates from a linear model as a benchmark.
• In many cases, the results of Probit and Logit are not so different.
• For the R session, let's install the package “mfx”.