Slide 3_distribution, Keisuke Kawata's HP

(1)

Additional rule

Student ID   Weight (kg)
1            60
2            70
3            80

(2)

Additional rule

Student ID   Test score   Gender
1            40           male
2            40           female
3            70           male

(3)

Download R and RStudio

• In the software exercise, we use R (and RStudio).

R: Statistical software (free!)

RStudio: A useful interface for R

• You should download R and RStudio.

1. Download R (please download the newest version); see http://rprogramming.net/how-to-download-r-quickly-and-easily/

2. Download RStudio

(4)

Econometrics: Probability

Keisuke Kawata

Hiroshima University

(5)

Randomness (or Chance)

Phenomena that include elements of chance or randomness:

• Rolling a die to see which number comes out on top

• The gender, height, and weight of the next new person you meet

• Tomorrow's weather

• The winner of the next World Cup

• The GDP growth rate 10 years ago

• The result of the lifetime competition between Prof. Kaneko and Yoshida

(6)

Structure of statistical works

Population

Sample

Sample: A subset of the data in the population.

(7)

Randomness of your data

→ Given the population, the characteristic of your data is random.

e.g., Wages of faculty staff

Population:

Name     Wage
Prof I   1000
Prof K    800
Prof Y    700

Potential sets of data (sample size = 2):

Set   Wage (ID 1)   Wage (ID 2)
1      800
2     1000           800
3      800           700
4      700
5     1000           700
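The randomness of the data can be illustrated with a short Python sketch. It draws samples of size 2 from the wage population on this slide. Drawing without replacement, the fixed seed, and the helper name `draw_sample` are assumptions made for illustration, not part of the slides.

```python
import random

# Population from the slide: name -> wage.
population = {"Prof I": 1000, "Prof K": 800, "Prof Y": 700}

def draw_sample(pop, size, rng):
    """Draw `size` members without replacement (an assumption) and return their wages."""
    names = rng.sample(sorted(pop), size)
    return [pop[name] for name in names]

rng = random.Random(1)  # fixed seed (an assumption) so the run is reproducible
for trial in range(3):
    # Each trial may give a different data set from the same population.
    print(trial + 1, draw_sample(population, 2, rng))
```

Each run of the loop plays the role of one researcher collecting one data set: the population stays fixed, while the observed data change from draw to draw.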

(8)

Plan of talks

1. Definition of probability.

2. Key concepts of a single random variable.

3. Key concepts of multiple random variables.

4. Properties of random sampling observations.

5. The asymptotic distribution of the sampling distribution ← plays a central role in statistics and econometrics.

(9)

Definition of probability and outcome

• Outcomes: the (mutually exclusive) potential results of a random process.

• The probability of an outcome: the proportion of times the outcome occurs if the random process is repeated infinitely many times.

Question: What are the outcomes of rolling a die, and what are their probabilities?
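The long-run proportion in this definition can be approximated by simulation. The sketch below rolls a simulated fair six-sided die many times and reports the share of each face. The fairness of the die, the seed, and the function name `relative_frequencies` are assumptions for illustration.

```python
import random
from collections import Counter

def relative_frequencies(n_rolls, seed=0):
    """Roll a fair six-sided die n_rolls times and return the share of each face."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

for n in (60, 6000, 600000):
    freqs = relative_frequencies(n)
    print(n, {face: round(share, 3) for face, share in freqs.items()})
```

As the number of rolls grows, each share should settle near 1/6, which is the sense in which probability is a proportion over infinitely many repetitions.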

(10)

Note: Discrete and continuous random outcomes

Formal definition

• Random variable

– Discrete random variable: takes on only a discrete set of values (the number on a die, the gender of the next new person you meet)

– Continuous random variable: takes on a continuum of values (the height and weight of the next new person you meet, GDP)

(11)

Probability distribution: Discrete random variable

• The probability distribution of a discrete random variable: the list of all outcomes and their probabilities.

• The probability of outcome x is denoted by Pr(x).

• The sum of the probabilities of all outcomes must be 1.

e.g., The probability distribution of rolling a die:

Outcome x           1     2     3     4     5     6
Probability Pr(x)   1/6   1/6   1/6   1/6   1/6   1/6
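A discrete probability distribution can be stored as a mapping from outcomes to probabilities. The sketch below does this for a fair die and checks that the probabilities sum to 1. The name `is_valid_distribution` is a hypothetical helper, and exact fractions are used only to avoid rounding error.

```python
from fractions import Fraction

# Probability distribution of a fair six-sided die: outcome -> Pr(outcome).
die = {x: Fraction(1, 6) for x in range(1, 7)}

def is_valid_distribution(dist):
    """Check that every probability lies in [0, 1] and that they sum to exactly 1."""
    in_range = all(0 <= p <= 1 for p in dist.values())
    return in_range and sum(dist.values()) == 1

print(is_valid_distribution(die))
```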

(12)

Probability of an event

• Event: A set of outcomes.

e.g., The event "the number on top of the die is lower than 3" = {1, 2}

• The probability of an event = the sum of the probabilities of the outcomes in the event.

e.g., The probability of the event in which the number on the die is higher than 2 = Pr(3) + Pr(4) + Pr(5) + Pr(6) = 4/6 = 2/3.
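The die example can be checked with a few lines of Python. This is a minimal sketch that treats an event as a set of outcomes and adds up their probabilities. The function name `event_probability` is an assumption for illustration.

```python
from fractions import Fraction

# Probability distribution of a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def event_probability(dist, event):
    """Probability of an event = sum of the probabilities of its outcomes."""
    return sum(dist[x] for x in event)

# Event: the number on the die is higher than 2, i.e. {3, 4, 5, 6}.
higher_than_2 = {x for x in die if x > 2}
print(event_probability(die, higher_than_2))  # 2/3
```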

(13)

Cumulative probability distribution: Discrete random variable

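• Cumulative probability distribution: the probability that the random variable is less than or equal to a particular value.

e.g., The probability distribution and cumulative probability distribution of rolling a die:

Outcome                               1     2     3     4     5     6
Probability distribution              1/6   1/6   1/6   1/6   1/6   1/6
Cumulative probability distribution   1/6   2/6   3/6   4/6   5/6   1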
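The cumulative probabilities for the die are running sums of the individual probabilities. A minimal sketch, assuming a fair die and using exact fractions (Python prints them in lowest terms, so 2/6 appears as 1/3):

```python
from fractions import Fraction
from itertools import accumulate

outcomes = [1, 2, 3, 4, 5, 6]
probabilities = [Fraction(1, 6)] * 6

# Pr(X <= x) is the sum of Pr(k) over all outcomes k up to and including x.
cumulative = list(accumulate(probabilities))

for x, p, c in zip(outcomes, probabilities, cumulative):
    print(x, p, c)
```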

(14)

Cumulative probability: Continuous random outcome

• Cumulative probability distribution: the probability that the random variable is less than or equal to a particular value.

e.g., The cumulative probability distribution of the height of the next new person you meet.

[Figure: Cumulative probability with continuous outcome; vertical axis: cumulative probability]

(15)

Probability distribution: Continuous random outcome

• We cannot make a list of the probabilities of each possible outcome because the outcome is a continuous variable.

→ We use the probability density function instead.

e.g., The probability density function of the height of the next new person you meet.

[Figure: Probability density]
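For a continuous outcome, probabilities are attached to intervals rather than to single values. The sketch below uses a hypothetical height distribution, normal with mean 170 cm and standard deviation 10 cm. These numbers are assumptions chosen for illustration and do not come from the slides.

```python
from statistics import NormalDist

# Hypothetical height distribution (an assumption): mean 170 cm, sd 10 cm.
height = NormalDist(mu=170, sigma=10)

def interval_probability(dist, low, high):
    """Pr(low <= X <= high), computed from the cumulative distribution."""
    return dist.cdf(high) - dist.cdf(low)

# The density at a point is not a probability; Pr(X = 170) itself is 0.
print(height.pdf(170))
# Probability that the height falls between 160 cm and 180 cm.
print(interval_probability(height, 160, 180))
```

The second number is the area under the density curve between 160 and 180, which is how a density function is turned into a probability.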

(16)

Expected values and variance

• Using only the probability distribution, can we show the characteristics of the random variable?

• To grasp the characteristics of a random variable, we often use some important mathematical concepts.

• In this class, we focus on the discrete random variable.

⇒ The definitions of each concept for a continuous random variable are basically the same as those for a discrete random variable (if you are interested, see Stock and Watson).

(17)

Expected values (or Mean)

• Expected value: the long-run average value of the random variable over repeated trials.

(Note) Expected value = expectation = mean

(18)

Question

Probability distribution

Outcomes      0     1     2
Probability   0.1   0.8   0.1

What is the expected value?
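One way to check an answer to this question is to compute the probability-weighted sum of the outcomes, which is the standard formula for the expected value of a discrete random variable. The function name `expected_value` is an assumption for illustration.

```python
def expected_value(dist):
    """Expected value of a discrete random variable: sum of outcome * probability."""
    return sum(x * p for x, p in dist.items())

# Distribution from the question: outcome -> probability.
question = {0: 0.1, 1: 0.8, 2: 0.1}

# 0 * 0.1 + 1 * 0.8 + 2 * 0.1
print(expected_value(question))
```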

(19)

Example: The limitation of expected value

• The random variable A

• The random variable B

Outcomes 0 1 2

• The expected values of random variables A and B are the same.

(20)

Standard Deviation and Variance

• To measure the dispersion or the spread of a probability distribution, we often use the variance and the standard deviation.
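A minimal sketch of how variance and standard deviation capture spread that the expected value misses. It uses the standard definitions: the variance is the probability-weighted average of squared deviations from the mean, and the standard deviation is its square root. The two distributions `a` and `b` are hypothetical examples with the same mean. They are assumptions, not the distributions shown on the slides.

```python
import math

def expected_value(dist):
    """Sum of outcome * probability."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Probability-weighted average of squared deviations from the mean."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def standard_deviation(dist):
    """Square root of the variance, in the same units as the outcomes."""
    return math.sqrt(variance(dist))

# Hypothetical distributions (assumptions): same mean, different spread.
a = {0: 0.1, 1: 0.8, 2: 0.1}
b = {0: 0.4, 1: 0.2, 2: 0.4}

for name, dist in (("A", a), ("B", b)):
    print(name, expected_value(dist), variance(dist), standard_deviation(dist))
```

Both distributions have an expected value of 1, but B puts more probability on the extreme outcomes, so its variance is larger.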

(21)

Standard Deviation and Variance

(22)

Example: rolling a die

• The probability distribution

Outcome       1     2     3     4     5     6
Probability   1/6   1/6   1/6   1/6   1/6   1/6

• What's the expected value?

• What's the variance?
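The two questions for the die can be worked through with exact fractions. This is a sketch under the assumption of a fair six-sided die, using the standard definitions of expected value and variance for a discrete random variable.

```python
from fractions import Fraction

# Fair six-sided die: outcome -> probability.
die = {x: Fraction(1, 6) for x in range(1, 7)}

# Expected value: sum of outcome * probability.
mean_die = sum(x * p for x, p in die.items())

# Variance: sum of (outcome - mean)^2 * probability.
variance_die = sum((x - mean_die) ** 2 * p for x, p in die.items())

print(mean_die)      # 7/2
print(variance_die)  # 35/12
```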

(23)

Key distribution: Normal distribution


(24)

Probabilistic Property of data

What is the probabilistic relationship between the population and simple random sampling data?

Important assumption ⇒ Simple random sampling: each member of the population is equally likely to be included in the sample.

(25)

Sampling distribution of the sample average

(26)

Example

Population:

Name     Wage
Prof I   1000
Prof K    800
Prof Y    600

Samples:

Set   Wage (ID 1)   Wage (ID 2)   Mean
1      800
2     1000           800
3      800           600
4      600
5      600
6     1000           600
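The sampling distribution of the sample average can be enumerated for the wage population in this example (1000, 800, 600). The sketch assumes samples of size 2 drawn without replacement and ignores the order of draws. That sampling scheme is an assumption and may differ from the one used on the slide.

```python
from itertools import combinations
from statistics import mean

# Wages of Prof I, Prof K, Prof Y from the example.
population = [1000, 800, 600]

# All samples of size 2, without replacement, order ignored (an assumption).
samples = list(combinations(population, 2))
sample_means = [mean(s) for s in samples]

for s, m in zip(samples, sample_means):
    print(s, m)

# Under this scheme each sample is equally likely, so the expected value of the
# sample mean is the plain average of the sample means.
print(mean(sample_means), mean(population))
```

Under this scheme the sample means differ from sample to sample, yet their average coincides with the population mean.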

(27)

Large-Sample Approximation

• There are two approaches to characterizing the sampling distribution:

– Exact distribution approach: deriving a formula for the sampling distribution that holds exactly for any sample size n.

– Large-sample approximation approach: using approximations of the sampling distribution when the sample size is infinitely large (n → ∞). ⇒ We use this approach.

• Two key theorems: the law of large numbers and the central limit theorem.

(28)

Consistency

(29)

Law of large numbers

• Combining Theorem 1 and consistency gives the following theorem.

Theorem 2: The sample mean is equal to the population mean with very high probability when the sample size is infinitely large.
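The theorem can be illustrated by simulation. The sketch below computes the sample mean of simulated rolls of a fair die for increasing sample sizes. The population mean of a fair die is 3.5. The die, the seed, and the function name `sample_mean_of_rolls` are assumptions for illustration.

```python
import random

POPULATION_MEAN = 3.5  # expected value of a fair six-sided die

def sample_mean_of_rolls(n, rng):
    """Sample mean of n independent rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

rng = random.Random(0)  # fixed seed (an assumption) for reproducibility
for n in (10, 1000, 100000):
    m = sample_mean_of_rolls(n, rng)
    print(n, m, abs(m - POPULATION_MEAN))
```

The gap between the sample mean and 3.5 tends to shrink as n grows, although any single small sample can still land far from the population mean.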

(30)

Central Limit Theorem
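A simulation sketch of the idea summarized in the conclusion: for a large sample, the distribution of the sample mean is approximately normal. The code standardizes the sample mean of fair-die rolls and checks how often it falls within ±1.96, which a standard normal variable does about 95% of the time. The die, the sample size of 100, the 5000 repetitions, and the seed are all assumptions for illustration.

```python
import math
import random

MU = 3.5                     # mean of a fair six-sided die
SIGMA = math.sqrt(35 / 12)   # standard deviation of a fair six-sided die

def standardized_sample_mean(n, rng):
    """(sample mean - MU) / (SIGMA / sqrt(n)) for n rolls of a fair die."""
    sample_mean = sum(rng.randint(1, 6) for _ in range(n)) / n
    return (sample_mean - MU) / (SIGMA / math.sqrt(n))

rng = random.Random(0)  # fixed seed (an assumption) for reproducibility
z_values = [standardized_sample_mean(100, rng) for _ in range(5000)]

# Share of standardized sample means inside [-1.96, 1.96].
share_within = sum(abs(z) <= 1.96 for z in z_values) / len(z_values)
print(round(share_within, 3))
```

A single die roll is uniform on 1 to 6, not bell-shaped. It is the sample mean, not the individual observation, that becomes approximately normal.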

(31)

Quiz

• True/False question.

Suppose the data are obtained by pure random sampling.

1. If the sample size is very large, the distribution of the value of a single observation converges to the normal distribution.

2. The sample mean equals the population mean.

3. If the population mean and variance are known, we can calculate the probability that the sample mean in a large sample is between 0 and 1.

(32)

Conclusion

• Given the population, the characteristic of your data is a random variable.

→ If your interest is in the characteristics of the population, you must first study the probabilistic relationship between the population and samples.

• If your data are pure random sampling data:

– The expected value of the sample mean is equal to the population mean, but the sample mean still has a positive variance.

– If the sample size is large enough, the distribution of the sample mean can be approximated by the normal distribution (Central Limit Theorem).
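The normal approximation makes it possible to compute probabilities for the sample mean once the population mean and variance are known. The sketch below uses the standard result that, under random sampling, the sample mean has mean equal to the population mean and variance equal to the population variance divided by n. The numbers (population mean 0.5, population variance 4, n = 100) and the function name are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist

def prob_sample_mean_between(low, high, pop_mean, pop_variance, n):
    """Normal approximation of Pr(low <= sample mean <= high) for a large sample."""
    approx = NormalDist(mu=pop_mean, sigma=math.sqrt(pop_variance / n))
    return approx.cdf(high) - approx.cdf(low)

# Hypothetical inputs (assumptions): population mean 0.5, variance 4, n = 100.
print(prob_sample_mean_between(0, 1, pop_mean=0.5, pop_variance=4, n=100))
```

With these inputs the sample mean has standard deviation 0.2, so the interval from 0 to 1 covers 2.5 standard deviations on each side of the mean.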