Alesina et al (2011)
1. What is the type of observations?
2. What is the type of data (cross-section, time-series, or panel)?
3. What are the main explanatory and explained variables?
4. What is the main estimation approach?
5. What can we say from Table 1?
6. What is a limitation of this paper's estimation? (You should explain it concretely.)
e.g., Even if overall crop productivity is constant, the average geoclimatic suitability may affect the fertility rate through non-agricultural activity (e.g., fishing, farming, and commerce).
Econometrics
Summary and future topics
Keisuke Kawata IDEC
Process of empirical work
1. Setting research question & data collection
2. Statistical work
3. Interpretation of statistical results
2. Statistical work
• In your master's thesis, if you would like to write about the "effect of", "impact of", or "consequence of" something, you should estimate the causal effects.
Causal effect: The change in the outcome if the treatment status is changed.
← defined by the potential outcome: $Y_i(d)$
• In many cases, we should focus on the average effect:
$E[Y_i(d')] - E[Y_i(d)]$
Typical steps
1. Data cleaning
2. Checking descriptive statistics for outcomes, treatments, and controls
3. Simple estimators: sample difference and/or OLS
4. Advanced estimation:
• DID and/or Fixed effects
• IV regression
• RDD
Simple estimator
• One simple estimator is the sample difference:
$E[Y_i(d') \mid D_i = d'] - E[Y_i(d) \mid D_i = d]$
→ In the continuous treatment case, you should use the OLS regression:
$E[Y_i \mid D_i = d] = \beta_0 + \tau d$
Key assumption to identify the causal effect: two types of randomness.
• Random sampling:
• Random treatment: Treatment status is determined purely at random.
→ We can obtain unbiased and consistent estimators of the causal effects of interest.
• If treatment status is not randomly determined
→ There may exist covariates that affect both treatments and outcomes.
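The contrast between random and non-random treatment can be illustrated with a small simulation. This is only a sketch on hypothetical data: the true effect of 2.0, the binary covariate `x`, the take-up probabilities, and the function names are all assumptions made for illustration, not part of the lecture material.

```python
import random


def mean(values):
    return sum(values) / len(values)


def sample_difference(outcomes, treatments):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    treated = [y for y, d in zip(outcomes, treatments) if d == 1]
    control = [y for y, d in zip(outcomes, treatments) if d == 0]
    return mean(treated) - mean(control)


def simulate(n, randomized, seed):
    """Hypothetical data: true causal effect 2.0, covariate x raises the outcome."""
    rng = random.Random(seed)
    outcomes, treatments = [], []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        if randomized:
            p_treat = 0.5                      # treatment ignores x
        else:
            p_treat = 0.8 if x == 1 else 0.2   # x drives treatment take-up
        d = 1 if rng.random() < p_treat else 0
        y = 1.0 + 2.0 * d + 3.0 * x + rng.gauss(0.0, 1.0)
        outcomes.append(y)
        treatments.append(d)
    return outcomes, treatments


y_rand, d_rand = simulate(20000, randomized=True, seed=1)
y_conf, d_conf = simulate(20000, randomized=False, seed=1)
diff_randomized = sample_difference(y_rand, d_rand)
diff_confounded = sample_difference(y_conf, d_conf)
print(f"random treatment:     {diff_randomized:.2f}")
print(f"non-random treatment: {diff_confounded:.2f}")
```

Under random treatment the sample difference should land near the assumed effect of 2.0. When `x` affects both treatment and outcome, the same estimator also picks up the effect of `x` and overstates the causal effect.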
Estimator with control variables
• The estimator of the causal effect is $\tau$, where
$E[Y_i \mid D_i = d, X_i = x] = \beta_0 + \tau d + \beta_X x$
• If treatment status is determined purely at random within each group $X_i = x$, we can obtain unbiased estimators.
⇔ If all covariates can be controlled for, we can obtain unbiased estimators.
Limitation: We cannot observe all covariates (omitted variables exist).
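A minimal sketch of how adding a control variable changes the estimate, again on hypothetical data (true effect 2.0, one observed binary covariate `x`). The coefficient on the treatment is computed by partialling out `x` from both the outcome and the treatment (the Frisch-Waugh-Lovell result), which gives the same number as the OLS regression of the outcome on treatment, covariate, and a constant. All names and parameter values are assumptions for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(y, x):
    """OLS slope from regressing y on x and a constant."""
    y_bar, x_bar = mean(y), mean(x)
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var = sum((xi - x_bar) ** 2 for xi in x)
    return cov / var


def residualize(v, x):
    """Residuals from regressing v on x and a constant."""
    b = slope(v, x)
    a = mean(v) - b * mean(x)
    return [vi - a - b * xi for vi, xi in zip(v, x)]


def ols_with_control(y, d, x):
    """Coefficient on d in the regression of y on d, x, and a constant."""
    return slope(residualize(y, x), residualize(d, x))


rng = random.Random(2)
y, d, x = [], [], []
for _ in range(20000):
    xi = 1 if rng.random() < 0.5 else 0
    di = 1 if rng.random() < (0.8 if xi == 1 else 0.2) else 0  # x drives take-up
    y.append(1.0 + 2.0 * di + 3.0 * xi + rng.gauss(0.0, 1.0))
    d.append(di)
    x.append(xi)

tau_without_control = slope(y, d)           # ignores x
tau_with_control = ols_with_control(y, d, x)  # holds x fixed
print(f"without control: {tau_without_control:.2f}")
print(f"with control:    {tau_with_control:.2f}")
```

The sketch works only because `x` is observed. If the covariate driving treatment were unobserved, neither regression would recover the assumed effect, which is the limitation stated above.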
DID and Fixed effects model
• The estimator of the causal effect is $\tau$, where
$E[Y_{it} \mid D_{it} = d, X_{it} = x, i, t] = \beta_0 + \tau d + \beta_X x + f_i + f_t$
• Within each group $X_{it} = x$, if there are no heterogeneous time trends, we can obtain unbiased estimators.
⇔ If changes in treatment status are determined purely at random, we can obtain unbiased estimators.
Limitation: Panel data are needed, and we cannot observe all covariates (omitted variables exist).
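The simplest case of this idea is a two-group, two-period DID, sketched below on hypothetical panel data. The assumptions are: a unit fixed effect that also drives selection into treatment, a common time trend of 1.0, and a true effect of 2.0. With two periods, differencing each unit over time removes the unit fixed effect, and comparing the two groups removes the common time effect.

```python
import random


def mean(values):
    return sum(values) / len(values)


rng = random.Random(3)
rows = []  # one tuple per unit: (treated_group, y_before, y_after)
for _ in range(10000):
    f_i = rng.gauss(0.0, 1.0)                            # unobserved unit fixed effect
    treated = 1 if f_i + rng.gauss(0.0, 1.0) > 0 else 0  # selection depends on f_i
    y_before = f_i + rng.gauss(0.0, 1.0)
    y_after = f_i + 1.0 + 2.0 * treated + rng.gauss(0.0, 1.0)  # common trend 1.0
    rows.append((treated, y_before, y_after))


def group_mean(data, group, column):
    """Mean of one column (1 = before, 2 = after) within one group."""
    return mean([r[column] for r in data if r[0] == group])


# Cross-sectional comparison after treatment: contaminated by f_i.
cross_section = group_mean(rows, 1, 2) - group_mean(rows, 0, 2)

# DID: change over time for the treated minus change over time for the controls.
did = (group_mean(rows, 1, 2) - group_mean(rows, 1, 1)) - (
    group_mean(rows, 0, 2) - group_mean(rows, 0, 1)
)
print(f"cross-section: {cross_section:.2f}")
print(f"DID:           {did:.2f}")
```

The DID estimate relies on the common-trend assumption built into the simulated data. If the treated group had its own time trend, the same calculation would attribute that trend to the treatment.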
IV regression
• The estimator of the causal effect is $\tau$, where
$E[D_i \mid Z_i = z, X_i = x] = \gamma_0 + \gamma_Z z + \gamma_X x$
$E[Y_i \mid D_i = d, X_i = x] = \beta_0 + \tau d + \beta_X x$
• We can obtain consistent estimators if
– the IV has a strong enough impact on treatments,
– the IV has no direct impact on outcomes,
– within each group $X_i = x$, IV status is determined purely at random.
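A rough two-stage least squares sketch on hypothetical data, without covariates for brevity. The assumptions: a randomly assigned binary instrument `z`, an unobserved confounder `u` that affects both treatment and outcome, and a true effect of 2.0. The first stage regresses the treatment on the instrument; the second stage regresses the outcome on the first-stage fitted values.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(y, x):
    """OLS slope from regressing y on x and a constant."""
    y_bar, x_bar = mean(y), mean(x)
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var = sum((xi - x_bar) ** 2 for xi in x)
    return cov / var


rng = random.Random(4)
y, d, z = [], [], []
for _ in range(20000):
    zi = 1 if rng.random() < 0.5 else 0        # instrument, assigned at random
    ui = rng.gauss(0.0, 1.0)                   # unobserved confounder
    di = 2.0 * zi + ui + rng.gauss(0.0, 1.0)   # treatment depends on z and u
    yi = 1.0 + 2.0 * di + 3.0 * ui + rng.gauss(0.0, 1.0)  # z enters only through d
    y.append(yi)
    d.append(di)
    z.append(zi)

# First stage: regress the treatment on the instrument and keep fitted values.
gamma_z = slope(d, z)
gamma_0 = mean(d) - gamma_z * mean(z)
d_hat = [gamma_0 + gamma_z * zi for zi in z]

# Second stage: regress the outcome on the fitted treatment.
tau_iv = slope(y, d_hat)
tau_ols = slope(y, d)  # biased, because u moves both d and y
print(f"first-stage coefficient: {gamma_z:.2f}")
print(f"OLS: {tau_ols:.2f}   IV: {tau_iv:.2f}")
```

In this simulated setting all three conditions hold by construction. With a weak first stage (a small `gamma_z`) the IV estimate becomes very noisy, which is why the first condition matters in practice.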
RDD
• The estimator of the causal effect is $\tau$, where
Sharp RDD: $E[Y_i \mid Z_i = z, X_i = x] = \beta_0 + \tau z + \beta_X x$
or
Fuzzy RDD: $E[D_i \mid Z_i = z, X_i = x] = \gamma_0 + \gamma_Z z + \gamma_X x$
$E[Y_i \mid D_i = d, X_i = x] = \beta_0 + \tau d + \beta_X x$
• We can obtain unbiased estimators if
– treatment status jumps discontinuously at the thresholds,
– covariates do not jump at the thresholds.
⇔ In the neighborhood of the thresholds, the value of the running variable is determined purely at random.
Limitation: A good "jump" is needed.
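A sharp RDD can be sketched as follows on hypothetical data. The assumptions: a running variable `r` centred so that the threshold is 0, treatment assigned to every unit with `r >= 0`, a true jump of 2.0, and a bandwidth `h` chosen arbitrarily. The estimator fits a separate line on each side of the threshold within the bandwidth (a local linear fit, one common way to implement RDD, not necessarily the specification above) and takes the gap between the two fitted values at the threshold.

```python
import random


def mean(values):
    return sum(values) / len(values)


def intercept_at_zero(y, r):
    """Fitted value at r = 0 from the OLS line of y on r and a constant."""
    y_bar, r_bar = mean(y), mean(r)
    cov = sum((ri - r_bar) * (yi - y_bar) for ri, yi in zip(r, y))
    var = sum((ri - r_bar) ** 2 for ri in r)
    return y_bar - (cov / var) * r_bar


rng = random.Random(5)
data = []  # (running variable centred at the threshold, outcome)
for _ in range(20000):
    r = rng.uniform(-1.0, 1.0)
    d = 1 if r >= 0 else 0                 # sharp rule: treated iff r >= 0
    y = 1.0 + 2.0 * d + 1.5 * r + rng.gauss(0.0, 0.5)
    data.append((r, y))

h = 0.25  # bandwidth, a hypothetical choice
left = [(r, y) for r, y in data if -h <= r < 0]
right = [(r, y) for r, y in data if 0 <= r <= h]

rdd_estimate = intercept_at_zero(
    [y for _, y in right], [r for r, _ in right]
) - intercept_at_zero([y for _, y in left], [r for r, _ in left])

# Comparing all treated with all untreated units mixes in the effect of r itself.
naive_difference = mean([y for r, y in data if r >= 0]) - mean(
    [y for r, y in data if r < 0]
)
print(f"naive difference: {naive_difference:.2f}")
print(f"RDD estimate:     {rdd_estimate:.2f}")
```

The estimate uses only observations near the threshold, so it needs many units there and a clear jump in treatment status, in line with the limitation noted above.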
Critical thinking
• YES/NO question: are the following statements true?
1. If estimation results are not consistent with the theoretical prediction, the estimation method is wrong. ⇒
2. If estimation results are not consistent with the theoretical prediction, the theory is wrong. ⇒
• Both theoretical predictions and estimation results crucially depend on "assumptions".
3. Interpretation
• You should distinguish between "identified" and "possible" interpretations.
e.g., The effect of gender on income; our estimator shows
$E[Y_i \mid \text{male}] - E[Y_i \mid \text{female}] = \$100$
and the difference is statistically significant at the …% level.
Identified interpretation: Gender inequality exists, and its point estimate is 100 dollars.
← We cannot argue about "why is it different?" or "what factors bring about the difference?"
Possible interpretation: If the average years of education of males is larger than that of females, the gender inequality may come from education inequality.
Statistical work: again
• By using advanced methods, you can obtain more identified interpretations. e.g.,
Beyond the average effects ⇒ Quantile regression
Beyond the effects ⇒ Decomposition analysis, causal mediation analysis
• We can identify the importance of each causal path.
[Figure: causal path diagram with the label "Education"]
Internal and External validity
Internal validity: How close our estimator is to the population characteristics.
External validity: Whether our estimator applies outside the population.
e.g., If we estimate the effects of MF on expenditure using a Bangladesh survey from 2010:
Internal validity: How close our estimator is to the true effect of MF in Bangladesh in 2010.
← Closely related to unbiasedness, consistency, and efficiency.
External validity: How close our estimator is to the effect in Bangladesh in 2016, in India in 2015, and in the US in 2017.
How to ensure external validity?
• One of the final purposes is to obtain implications for future policy.
⇒ We cannot get samples from the true population of interest.
← External validity is very important.
• Informal way: Using samples from a "similar population" to our true population of interest.
e.g., If the true interest is Bangladesh in 2017, samples from Bangladesh are better than samples from the U.S., and samples from 2010 are better than samples from 1990.
Statistical work: again
• One formal way to ensure external validity is to combine a mathematical model with statistical work.
Structural estimation: Estimation of the deep parameters of the mathematical model, and simulation of the effects of a policy change.
Deep parameters: Parameters that are relatively stable under policy changes.
1. Setting research question
• What is a "good research question"?
⇒ Economic theory tells us the importance of understanding the tension between
Preference: What do you want to know?
Constraints: What can you know?
If your research question is outside your constraints, you face serious trouble.
• Now you understand basic econometrics. ⇒ After checking your data, you
Constraints: Simple research question
• If you would like to obtain a valid answer to your question, you should set a "simple research question". ⇒ You should focus on only one treatment.
Note: In many cases, you can easily estimate the impacts on various outcomes.
• If you set multiple treatments, the assumptions for unbiasedness become quite demanding.
e.g., Research question: Comparison of two binary treatments $(D_1, D_2)$. We need the following two assumptions:
$E[Y_i(d_1, d_2) \mid D_{1i} = 1, D_{2i} = d] = E[Y_i(d_1, d_2) \mid D_{1i} = 0, D_{2i} = d]$
$E[Y_i(d_1, d_2) \mid D_{1i} = d, D_{2i} = 1] = E[Y_i(d_1, d_2) \mid D_{1i} = d, D_{2i} = 0]$
Preference: Motivation parts
• When writing a paper, the motivation part is the most important and the most difficult.
Motivation: An explanation of why people should read the paper.
• Something new is needed.
• To explain the importance of your question, theories are useful.
Additionally, you should go through the following checklist:
1. Why is your research question important in your target society?
2. Why is your target society relevant to your question?
Conclusion
• To find a feasible and interesting research question:
1. Check your data,
2. Check the previous literature,
3. (sometimes) brush up on the contents of this course.