Mem. Fac. Educ., Kagawa Univ. II, 53(2003), 73-83
On the Evaluations of Sum of Squares by Using the Range II
by
Toshihiko MENDORI, Yoshihisa ZINNO and Hiroo FUKAISHI
(Received July 18, 2003)
Abstract
As an application of the main theorem of our previous paper to frequency and probability distributions of discrete type, we give an estimation of the g.l.b. and the l.u.b. of the sum of squares by the range $R$ of the values. Moreover, we extend the theorem to probability distributions of continuous type and show that the variance $\sigma^2$ of a distribution satisfies $0 < \sigma^2 \le R^2/4$.
§ 1. Introduction
In the previous paper [4] we gave the main theorem of the evaluation of sum of squares by using the data range and geometric representations. In this paper we shall extend our study of evaluation of sum of squares for data with a frequency distribution (§ 2) and for a probability distribution (§ 3).
Briefly we recall the notations and the main theorem.
Let $\{x_1, x_2, \dots, x_n\}$, $n > 1$, be a data set of $n$ values. Define the data range $R$ and the mean $\bar{x}$ of the data as follows:
$$R = \max_{1 \le i \le n} x_i - \min_{1 \le i \le n} x_i, \qquad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,$$
respectively. Define the variance $s^2$ of the data and the unbiased estimate $u^2$ of the population variance as follows:
$$s^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2, \qquad u^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2,$$
respectively.
Now define the function $g(x)$ by the following:
$$g(x) = \sum_{i=1}^{n}(x_i - x)^2.$$
Then the function $g(x)$ takes the minimum at $x = \bar{x}$:
$$g(\bar{x}) = n s^2 = (n-1)u^2.$$
Theorem 1 (Main Theorem [4; Theorem 2.1]). The minimum of the sum of squares $g(\bar{x})$ is evaluated by using the data range $R$ as follows:
$$\frac{1}{2}R^2 \le g(\bar{x}) \le \left\{\frac{n}{4} - \frac{1-(-1)^n}{8n}\right\}R^2.$$
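The two-sided estimate of Theorem 1 can be checked numerically. The following is a minimal sketch, not part of the original paper: the helper names `sum_of_squares`, `theorem1_bounds` and `check_random` are hypothetical, and the random test only samples a few thousand values, so it illustrates the inequality rather than proving it.

```python
import random


def sum_of_squares(xs):
    """Return g(x-bar): the sum of squared deviations from the mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def theorem1_bounds(n, R):
    """Lower and upper bounds of Theorem 1 for n values with range R."""
    lower = R ** 2 / 2
    upper = (n / 4 - (1 - (-1) ** n) / (8 * n)) * R ** 2
    return lower, upper


def check_random(trials=1000, seed=0):
    """Test the inequality on random data sets (illustrative only)."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(2, 12)
        xs = [rng.uniform(-5, 5) for _ in range(n)]
        R = max(xs) - min(xs)
        lower, upper = theorem1_bounds(n, R)
        g = sum_of_squares(xs)
        # Small tolerance for floating-point rounding.
        if not (lower - 1e-9 <= g <= upper + 1e-9):
            return False
    return True
```

For $n = 2$ both bounds coincide at $R^2/2$, which is why the check needs a small floating-point tolerance.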
In the proof of Theorem 1 the following are shown [4].
(1) The function $g(\bar{x})$ takes the minimum in the case of
$$x_2 = x_3 = \cdots = x_{n-1} = \frac{x_1 + x_n}{2},$$
where $R = x_n - x_1$.
(2) The function $g(\bar{x})$ takes the maximum in the case of
$$(x_1, x_2, \dots, x_{n-1}, x_n) = (\underbrace{x_1, \dots, x_1}_{n/2}, \underbrace{x_n, \dots, x_n}_{n/2}): \qquad \max_{x_1 \le x_2 \le \cdots \le x_n} g(\bar{x}) = \frac{n}{4}R^2, \quad \text{if } n \text{ is even},$$
or at each of the cases
$$(x_1, x_2, \dots, x_{n-1}, x_n) = (\underbrace{x_1, \dots, x_1}_{(n-1)/2}, \underbrace{x_n, \dots, x_n}_{(n+1)/2})$$
and
$$(x_1, x_2, \dots, x_{n-1}, x_n) = (\underbrace{x_1, \dots, x_1}_{(n+1)/2}, \underbrace{x_n, \dots, x_n}_{(n-1)/2}): \qquad \max_{x_1 \le x_2 \le \cdots \le x_n} g(\bar{x}) = \frac{n^2-1}{4n}R^2, \quad \text{if } n \text{ is odd}.$$
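The extremal configurations described above can be built explicitly and evaluated with exact rational arithmetic. This is a sketch under stated assumptions: the function names are hypothetical, the endpoints are taken to be integers for convenience, and for odd $n$ only one of the two symmetric splits is constructed.

```python
from fractions import Fraction


def g_at_mean(xs):
    """Sum of squared deviations from the mean, computed exactly."""
    mean = Fraction(sum(xs), len(xs))
    return sum((x - mean) ** 2 for x in xs)


def minimizing_data(n, x1, xn):
    """Two extremes, with the other n - 2 values at the midpoint."""
    mid = Fraction(x1 + xn, 2)
    return [Fraction(x1)] + [mid] * (n - 2) + [Fraction(xn)]


def maximizing_data(n, x1, xn):
    """All values at the extremes, split as evenly as possible."""
    low = n // 2  # for odd n the mirrored split gives the same value
    return [Fraction(x1)] * low + [Fraction(xn)] * (n - low)
```

With `x1 = 0` and `xn = 2` (so $R = 2$), the minimizing data give $R^2/2$ for any $n$, and the maximizing data give $nR^2/4$ or $(n^2-1)R^2/(4n)$ according to the parity of $n$.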
§ 2. The sum of squares for frequency distributions
Let $\{\xi_1, \xi_2, \dots, \xi_m\}$, $m > 1$, be a set of $m$ class mark values satisfying the following conditions:
i) $\xi_{i+1} - \xi_i = R/(m-1)$ for $1 \le i \le m-1$, where $R$ is the range of the values: $R = \xi_m - \xi_1 > 0$,
ii) the frequency $f_i$ of the data with the value $\xi_i$ is positive for both extremes: $f_1 > 0$, $f_m > 0$, and
iii) $\sum_{i=1}^{m} f_i = n$.
Then the mean and the variance of the data are given by
$$\bar{\xi} = \frac{1}{n}\sum_{i=1}^{m} \xi_i f_i, \qquad s^2 = \frac{1}{n}\sum_{i=1}^{m} (\xi_i - \bar{\xi})^2 f_i,$$
respectively, and we have that
$$g(x) = \sum_{i=1}^{m} (\xi_i - x)^2 f_i.$$
From Theorem 1, the minimum of $g(x)$ is given by
$$g(\bar{\xi}) = n s^2.$$
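Theorem 2. The sum of squares $g(\bar{\xi})$ for a frequency distribution is evaluated by the range $R$ of the values as follows:
$$\left\{\frac{1}{2} + \frac{1+(-1)^m}{4(m-1)^2}\cdot\frac{n-2}{n}\right\}R^2 \le g(\bar{\xi}) \le \left\{\frac{n}{4} - \frac{1-(-1)^n}{8n}\right\}R^2.$$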
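For small $m$ and $n$ the bounds of Theorem 2 can be compared with an exhaustive search over all admissible frequency distributions. The sketch below is an illustration only: the function names are hypothetical, and the class marks are assumed to be $0, 1, \dots, m-1$, so that $R = m - 1$.

```python
from fractions import Fraction
from itertools import product


def theorem2_bounds(m, n, R):
    """Lower and upper bounds of Theorem 2 as exact fractions."""
    parity_term = Fraction(1 + (-1) ** m, 4 * (m - 1) ** 2)
    lower = (Fraction(1, 2) + parity_term * Fraction(n - 2, n)) * R ** 2
    upper = (Fraction(n, 4) - Fraction(1 - (-1) ** n, 8 * n)) * R ** 2
    return lower, upper


def g_frequency(marks, freqs):
    """Sum of squares about the mean for class marks with frequencies."""
    n = sum(freqs)
    mean = Fraction(sum(t * f for t, f in zip(marks, freqs)), n)
    return sum((t - mean) ** 2 * f for t, f in zip(marks, freqs))


def brute_force_extremes(m, n):
    """Smallest and largest g over all frequency tables with f_1, f_m > 0."""
    marks = list(range(m))  # assumed class marks, so R = m - 1
    values = []
    for freqs in product(range(n + 1), repeat=m):
        if sum(freqs) == n and freqs[0] > 0 and freqs[-1] > 0:
            values.append(g_frequency(marks, freqs))
    return min(values), max(values)
```

The search grows like $(n+1)^m$, so it is only practical for very small cases, but within that range it shows that both bounds are attained.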
Proof. I. The minimum of $g(\bar{\xi})$.
Theorem 1 suggests that $g(\bar{\xi})$ takes the minimum at
$$x_i = \frac{\xi_1 + \xi_m}{2} \quad \text{for } 2 \le i \le n-1.$$
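If $m$ is odd, the sum of squares $g(\bar{\xi})$ takes the minimum for the frequency distribution
$$\begin{array}{c|ccccccccc|c}
\xi_i & \xi_1 & \xi_2 & \cdots & \xi_{(m-1)/2} & \xi_{(m+1)/2} & \xi_{(m+3)/2} & \cdots & \xi_{m-1} & \xi_m & \text{total} \\ \hline
f_i & 1 & 0 & \cdots & 0 & n-2 & 0 & \cdots & 0 & 1 & n
\end{array}$$
where $\xi_{(m+1)/2} = \dfrac{\xi_1 + \xi_m}{2}$. Then
$$\bar{\xi} = \frac{\xi_1 + \xi_m}{2} \quad \text{and} \quad g(\bar{\xi}) = \frac{1}{2}R^2.$$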
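If $m$ is even, $g(\bar{\xi})$ takes the minimum when each $x_i$, $2 \le i \le n-1$, takes the class mark nearest to $(\xi_1 + \xi_m)/2$. Therefore, $g(\bar{\xi})$ takes the minimum for the frequency distributions
$$\begin{array}{c|ccccccccc|c}
\xi_i & \xi_1 & \xi_2 & \cdots & \xi_{m/2-1} & \xi_{m/2} & \xi_{m/2+1} & \cdots & \xi_{m-1} & \xi_m & \text{total} \\ \hline
f_i & 1 & 0 & \cdots & 0 & n-2 & 0 & \cdots & 0 & 1 & n
\end{array}$$
where $\xi_{m/2} = \dfrac{\xi_1 + \xi_m}{2} - \dfrac{1}{2(m-1)}R$, and
$$\begin{array}{c|ccccccccc|c}
\xi_i & \xi_1 & \xi_2 & \cdots & \xi_{m/2} & \xi_{m/2+1} & \xi_{m/2+2} & \cdots & \xi_{m-1} & \xi_m & \text{total} \\ \hline
f_i & 1 & 0 & \cdots & 0 & n-2 & 0 & \cdots & 0 & 1 & n
\end{array}$$
where $\xi_{m/2+1} = \dfrac{\xi_1 + \xi_m}{2} + \dfrac{1}{2(m-1)}R$. Then, for the first and the second distribution respectively,
$$\bar{\xi} = \frac{\xi_1 + \xi_m}{2} \mp \frac{n-2}{2(m-1)n}R$$
and
$$g(\bar{\xi}) = \frac{1}{2}R^2 + \frac{n-2}{2(m-1)^2 n}R^2.$$
II. The maximum of $g(\bar{\xi})$.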
If $n$ is even, the sum of squares $g(\bar{\xi})$ takes the maximum for the frequency distribution
$$\begin{array}{c|ccccc|c}
\xi_i & \xi_1 & \xi_2 & \cdots & \xi_{m-1} & \xi_m & \text{total} \\ \hline
f_i & n/2 & 0 & \cdots & 0 & n/2 & n
\end{array}$$
i.e.,
$$\bar{\xi} = \xi_1 + \frac{1}{2}R \quad \text{and} \quad g(\bar{\xi}) = \frac{n}{4}R^2.$$
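If $n$ is odd, in order that $g(\bar{\xi})$ takes the maximum, the frequency distributions should be the following:
$$\begin{array}{c|ccccc|c}
\xi_i & \xi_1 & \xi_2 & \cdots & \xi_{m-1} & \xi_m & \text{total} \\ \hline
f_i & (n \pm 1)/2 & 0 & \cdots & 0 & (n \mp 1)/2 & n
\end{array}$$
i.e.,
$$\bar{\xi} = \xi_1 + \frac{n \mp 1}{2n}R \quad \text{and} \quad g(\bar{\xi}) = \frac{n^2-1}{4n}R^2. \qquad \square$$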
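The frequency tables used in the proof of Theorem 2 can be written down directly and evaluated. The following sketch assumes class marks $0, 1, \dots, m-1$ (so $R = m-1$) and uses hypothetical helper names; for even $m$ and odd $n$ it builds only one of the two symmetric tables.

```python
from fractions import Fraction


def g_frequency(marks, freqs):
    """Sum of squares about the mean for class marks with frequencies."""
    n = sum(freqs)
    mean = Fraction(sum(t * f for t, f in zip(marks, freqs)), n)
    return sum((t - mean) ** 2 * f for t, f in zip(marks, freqs))


def minimizing_frequencies(m, n):
    """One value at each extreme, the other n - 2 at the central class mark."""
    freqs = [0] * m
    freqs[0] += 1
    freqs[-1] += 1
    # 0-based index of xi_{(m+1)/2} for odd m and of xi_{m/2} for even m.
    centre = (m - 1) // 2
    freqs[centre] += n - 2
    return freqs


def maximizing_frequencies(m, n):
    """All values at the two extremes, split as evenly as possible."""
    freqs = [0] * m
    freqs[0] = (n + 1) // 2
    freqs[-1] = n // 2
    return freqs
```

For example, `g_frequency(list(range(4)), minimizing_frequencies(4, 6))` evaluates the even-$m$ minimum for $m = 4$, $n = 6$, which can be compared with $\frac{1}{2}R^2 + \frac{n-2}{2(m-1)^2 n}R^2$ at $R = 3$.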
§ 3. The sum of squares for probability distributions
A relative frequency distribution can be considered as a probability distribution with rational occurrence ratios. So, as the second step, we proceed to the evaluation of sum of squares of a probability distribution.
From Theorem 2 we immediately know that
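$$\left\{\frac{1}{2n} + \frac{1+(-1)^m}{4(m-1)^2}\cdot\frac{n-2}{n^2}\right\}R^2 \le s^2 \le \left\{\frac{1}{4} - \frac{1-(-1)^n}{8n^2}\right\}R^2.$$
Since each of $p_i = f_i/n$ is rational for any integer $i$, $1 \le i \le m$, the sum of squares
$$s^2 = \sum_{i=1}^{m} (\xi_i - \bar{\xi})^2 p_i$$
satisfies
$$0 < s^2 \le \frac{1}{4}R^2.$$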
3.1. The sum of squares for discrete probability distributions.
Let $\{\xi_1, \xi_2, \dots, \xi_m\}$, $m > 1$, be a set of $m$ values, satisfying the following conditions:
i) $\xi_{i+1} - \xi_i = R/(m-1)$ for $1 \le i \le m-1$, where $R$ is the range of the values: $R = \xi_m - \xi_1 > 0$,
ii) the real occurrence probability $p_i$ of the variable with the value $\xi_i$ is positive for both extremes: $p_1 > 0$, $p_m > 0$, and
iii) $\sum_{i=1}^{m} p_i = 1$.
Then the mean $\mu$ and the variance $\sigma^2$ of the data are given by
$$\mu = \sum_{i=1}^{m} \xi_i p_i, \qquad \sigma^2 = \sum_{i=1}^{m} (\xi_i - \mu)^2 p_i,$$
respectively; and we define that
$$g(x) = \sum_{i=1}^{m} (\xi_i - x)^2 p_i.$$
Then the minimum of $g(x)$ is given by $g(\mu) = \sigma^2$, as in Theorem 1.
Theorem 3. The sum of squares of deviation from the mean $g(\mu)$ for a discrete probability distribution is evaluated by the range $R$ of the values as follows:
$$0 < g(\mu) \le \frac{1}{4}R^2.$$
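Proof. Setting $f_{i,k} = [2^k p_i]$, the integer part of $2^k p_i$, we have
$$f_{i,k} \le 2^k p_i < f_{i,k} + 1,$$
i.e.,
$$p_i - 2^{-k} < 2^{-k} f_{i,k} \le p_i$$
for $1 \le i \le m$ and $k = 1, 2, \dots$. Let us set $n_k$, $\mu_k$ and $\sigma_k$ by the following:
$$n_k = \sum_{i=1}^{m} f_{i,k}, \qquad \mu_k = \frac{1}{n_k}\sum_{i=1}^{m} \xi_i f_{i,k} \qquad \text{and} \qquad \sigma_k^2 = \frac{1}{n_k}\sum_{i=1}^{m} (\xi_i - \mu_k)^2 f_{i,k}.$$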
Then the inequality
$$\sum_{i=1}^{m} p_i - 2^{-k} m < 2^{-k}\sum_{i=1}^{m} f_{i,k} \le \sum_{i=1}^{m} p_i$$
implies
$$\lim_{k \to \infty} 2^{-k} n_k = 1.$$
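Following the inequality
$$\mu - 2^{-k}\sum_{i=1}^{m} \xi_i < 2^{-k}\sum_{i=1}^{m} \xi_i f_{i,k} \le \mu,$$
and dividing by $2^{-k} n_k$, we have
$$\lim_{k \to \infty} \mu_k = \mu.$$
Similarly we have
$$\lim_{k \to \infty} \sigma_k^2 = \sigma^2.$$
From Theorem 1, we have
$$\sigma_k^2 \le \frac{1}{4}R^2,$$
whence
$$0 < \sigma^2 = g(\mu) \le \frac{1}{4}R^2. \qquad \square$$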
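The approximation step in this proof replaces each probability $p_i$ by the dyadic fraction $[2^k p_i]/2^k$ and treats the integers $[2^k p_i]$ as frequencies. A minimal sketch of that step follows; the function name and the sample distribution are assumptions made for illustration, and floating-point numbers stand in for real probabilities.

```python
from math import floor, pi


def dyadic_approximation(values, probs, k):
    """Return (n_k, mu_k, sigma_k^2) for frequencies f_i = floor(2^k p_i)."""
    scale = 2 ** k
    freqs = [floor(scale * p) for p in probs]
    n_k = sum(freqs)  # must be positive, which holds once k is large enough
    mu_k = sum(v * f for v, f in zip(values, freqs)) / n_k
    var_k = sum((v - mu_k) ** 2 * f for v, f in zip(values, freqs)) / n_k
    return n_k, mu_k, var_k


# Assumed example: equally spaced values with range R = 3 and one
# probability (1/pi) that is not a simple fraction.
VALUES = [0.0, 1.0, 2.0, 3.0]
PROBS = [1 / pi, 0.2, 0.1, 1 - 1 / pi - 0.3]
MU = sum(v * p for v, p in zip(VALUES, PROBS))
SIGMA2 = sum((v - MU) ** 2 * p for v, p in zip(VALUES, PROBS))
```

Calling `dyadic_approximation(VALUES, PROBS, k)` for increasing `k` gives values of `mu_k` and `var_k` that approach `MU` and `SIGMA2`, while `n_k / 2**k` approaches 1.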
3.2. The sum of squares for continuous probability distributions.
Let $p(x)$ be a bounded density function of $x \in [a, b]$, satisfying the conditions:
i) $p(x) \ge 0$ for $x \in [a, b]$, $p(x) = 0$ otherwise, and
ii) $\int_{-\infty}^{\infty} p(x)\,dx = 1$.
Then the mean and the variance of $x$ are
$$\mu = \int_{-\infty}^{\infty} x\,p(x)\,dx \quad \text{and} \quad \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 p(x)\,dx,$$
respectively. Define the function $g(t)$ by the following:
$$g(t) = \int_{-\infty}^{\infty} (t - x)^2 p(x)\,dx.$$
Then the minimum of $g(t)$ is given by $g(\mu) = \sigma^2$.
Theorem 4. The sum of squares of deviation from the mean $g(\mu)$ for a continuous probability distribution is evaluated by the range $R$ of $x$ as follows:
$$0 < g(\mu) \le \frac{1}{4}R^2.$$
Proof. For an integer $k \ge 1$, let $x_i$ and $p_i$ be as follows:
$$x_i = a + (i-1)\Delta x, \qquad p_i = p(x_i) \quad \text{for } 1 \le i \le m,$$
where $\Delta x = R/2^k$, $m = 2^k + 1$. Then
$$\lim_{k \to \infty}\sum_{i=1}^{2^k} p(x_i)\Delta x = \int_{-\infty}^{\infty} p(x)\,dx = 1.$$
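Let us use notations similar to those in Theorem 3:
$$f_{i,k} = [2^k p_i], \qquad n_k = \sum_{i=1}^{2^k} f_{i,k}, \qquad \mu_k = \frac{1}{n_k}\sum_{i=1}^{2^k} x_i f_{i,k}$$
and
$$\sigma_k^2 = \frac{1}{n_k}\sum_{i=1}^{2^k} (x_i - \mu_k)^2 f_{i,k}.$$
Then, the inequality
$$\sum_{i=1}^{2^k} p_i R/2^k - R/2^k < 2^{-k}\sum_{i=1}^{2^k} f_{i,k} R/2^k \le \sum_{i=1}^{2^k} p_i R/2^k$$
implies
$$\lim_{k \to \infty} 2^{-k} n_k R/2^k = \lim_{k \to \infty}\sum_{i=1}^{2^k} p_i R/2^k = 1.$$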
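From the inequality
$$\sum_{i=1}^{2^k} x_i p_i R/2^k - 2^{-k}\sum_{i=1}^{2^k} x_i R/2^k < 2^{-k}\sum_{i=1}^{2^k} x_i f_{i,k} R/2^k \le \sum_{i=1}^{2^k} x_i p_i R/2^k,$$
after dividing by $2^{-k} n_k R/2^k$, we obtain bounds on $\mu_k$ whose two sides tend to the same limit, which implies
$$\lim_{k \to \infty} \mu_k = \mu.$$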
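Similarly we have
$$\frac{\sum_{i=1}^{2^k} (x_i - \mu)^2 p_i R/2^k - 2^{-k}\sum_{i=1}^{2^k} (x_i - \mu)^2 R/2^k}{\sum_{i=1}^{2^k} p_i R/2^k} < \frac{2^{-k}\sum_{i=1}^{2^k} (x_i - \mu)^2 f_{i,k} R/2^k}{2^{-k} n_k R/2^k} \le \frac{\sum_{i=1}^{2^k} (x_i - \mu)^2 p_i R/2^k}{\sum_{i=1}^{2^k} p_i R/2^k - R/2^k},$$
which, together with $\lim_{k \to \infty} \mu_k = \mu$, implies
$$\lim_{k \to \infty} \sigma_k^2 = \sigma^2.$$
From Theorem 1, it follows
$$\sigma_k^2 \le \frac{1}{4}R^2.$$
Therefore, we have
$$0 < \sigma^2 = g(\mu) \le \frac{1}{4}R^2. \qquad \square$$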
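The discretisation used in this proof can be tried on a concrete density. The sketch below samples the density at $x_i = a + (i-1)R/2^k$, takes $f_{i,k} = [2^k p(x_i)]$ as frequencies, and returns $\mu_k$ and $\sigma_k^2$. The function names and the choice of a triangular density on $[-1, 1]$ (variance $1/6$) are assumptions made for illustration.

```python
from math import floor


def discretised_moments(density, a, b, k):
    """Return (mu_k, sigma_k^2) from frequencies floor(2^k p(x_i))."""
    cells = 2 ** k
    dx = (b - a) / cells
    xs = [a + i * dx for i in range(cells)]  # x_1, ..., x_{2^k}
    freqs = [floor(cells * density(x)) for x in xs]
    n_k = sum(freqs)
    mu_k = sum(x * f for x, f in zip(xs, freqs)) / n_k
    var_k = sum((x - mu_k) ** 2 * f for x, f in zip(xs, freqs)) / n_k
    return mu_k, var_k


def triangular_density(x):
    """Assumed test density: p(x) = 1 - |x| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(x))
```

For this density $R = 2$, so the bound of Theorem 4 reads $\sigma^2 \le 1$; `discretised_moments(triangular_density, -1.0, 1.0, k)` approaches the mean 0 and the variance $1/6$ as `k` grows.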
Let us give some examples.
Example 1. Let $p(x)$ be the density function of a distribution defined by the following:
$$p(x) = \begin{cases} \dfrac{n+1}{2a^{n+1}}(a + x)^n & \text{for } x \in [-a, 0], \\[2mm] \dfrac{n+1}{2a^{n+1}}(a - x)^n & \text{for } x \in [0, a], \\[2mm] 0 & \text{otherwise}, \end{cases}$$
where $n$ is a non-negative integer. Then the mean $\mu$ and the variance $\sigma^2$ are given by
$$\mu = 0 \quad \text{and} \quad \sigma^2 = \frac{2}{(n+2)(n+3)}a^2 = \frac{1}{2(n+2)(n+3)}R^2.$$
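The variance stated in Example 1 can be confirmed by numerical integration. This is a sketch only: the midpoint rule and the function names are choices made here, not taken from the paper.

```python
def density_example1(x, n, a):
    """Density of Example 1: (n+1)/(2 a^(n+1)) * (a - |x|)^n on [-a, a]."""
    if -a <= x <= a:
        return (n + 1) / (2 * a ** (n + 1)) * (a - abs(x)) ** n
    return 0.0


def midpoint_integral(func, lo, hi, steps=20000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (j + 0.5) * h) for j in range(steps))


def moments_example1(n, a):
    """Return (total mass, mean, variance) by numerical integration."""
    total = midpoint_integral(lambda x: density_example1(x, n, a), -a, a)
    mean = midpoint_integral(lambda x: x * density_example1(x, n, a), -a, a)
    var = midpoint_integral(
        lambda x: (x - mean) ** 2 * density_example1(x, n, a), -a, a
    )
    return total, mean, var


def closed_form_variance(n, a):
    """Variance 2 a^2 / ((n+2)(n+3)) as stated in Example 1."""
    return 2 * a ** 2 / ((n + 2) * (n + 3))
```

Since $R = 2a$, the closed form equals $R^2/(2(n+2)(n+3))$, which stays below $R^2/4$ for every $n \ge 0$.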
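Example 2. Let $p(x)$ be the density function of a distribution defined by the following:
$$p(x) = \begin{cases} \dfrac{n+1}{2a^{n+1}}(-x)^n & \text{for } x \in [-a, 0], \\[2mm] \dfrac{n+1}{2a^{n+1}}x^n & \text{for } x \in [0, a], \\[2mm] 0 & \text{otherwise}, \end{cases}$$
where $n$ is a non-negative integer. Then the mean $\mu$ and the variance $\sigma^2$ are given by
$$\mu = 0 \quad \text{and} \quad \sigma^2 = \frac{n+1}{n+3}a^2 = \frac{n+1}{4(n+3)}R^2.$$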
In [1] C. F. Gauss explained the probability density function in Example 1 for n = 0, 1.
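For the density of Example 2, integrating $x^2 p(x)$ over $[-a, a]$ directly gives $\sigma^2 = (n+1)a^2/(n+3)$; this closed form is computed for the sketch below and should be read as a derived value. The code checks it numerically and shows how the ratio $\sigma^2 / (R^2/4)$ approaches 1 as $n$ grows. The function names are hypothetical.

```python
def density_example2(x, n, a):
    """Density of Example 2: (n+1)/(2 a^(n+1)) * |x|^n on [-a, a]."""
    if -a <= x <= a:
        return (n + 1) / (2 * a ** (n + 1)) * abs(x) ** n
    return 0.0


def variance_example2(n, a, steps=20000):
    """Midpoint-rule value of the integral of x^2 p(x); the mean is 0."""
    h = 2 * a / steps
    points = (-a + (j + 0.5) * h for j in range(steps))
    return h * sum(x * x * density_example2(x, n, a) for x in points)


def ratio_to_upper_bound(n):
    """sigma^2 / (R^2 / 4) = (n + 1) / (n + 3), using R = 2a."""
    return (n + 1) / (n + 3)
```

The ratio is $1/3$ for the uniform case $n = 0$ and increases towards 1, so the upper bound $R^2/4$ of Theorem 4 is approached but not attained within this family.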
References
[1] C. F. Gauss: Theoria combinationis observationum erroribus minimis obnoxiae, pars prior, Commentationes Societatis Regiae Scientiarum Gottingensis Recentiores, 5 (1823). (In the book: C. F. Gauss: Gosa-ron, edited and translated into Japanese by T. Hida and T. Ishikawa, Kinokuniya, Tokyo, 1981.)
[2] M. J. R. Healy: Matrices for Statistics, Oxford University Press, New York, 1986.
[3] R. V. Hogg and A. T. Craig: Introduction to Mathematical Statistics, 4th ed., Collier Macmillan International Editions, 1978.
[4] T. Mendori, Y. Zinno and H. Fukaishi: On the evaluations of sum of squares by using the range, Mem. Fac. Educ., Kagawa Univ. II, 53(2003), 1-12.
[5] A. M. Mood, F. A. Graybill and D. C. Boes: Introduction to the Theory of Statistics, 3rd ed., McGraw-Hill, 1974.
[6] G. W. Snedecor, W. G. Cochran and D. F. Cox: Statistical Methods, 8th ed., The Iowa State University Press, Ames, 1989.
Toshihiko MENDORI
Emeritus Professor of Kagawa University
1388-3 Nii, Kokubunji-cho, Kagawa, 769-0101, JAPAN
E-mail address: mendori@hkg.odn.ne.jp
Yoshihisa ZINNO
1963-2 Takamatsu-cho, Takamatsu-shi, 761-0104, JAPAN
Hiroo FUKAISHI
Department of Mathematics, Faculty of Education, Kagawa University
1-1 Saiwai-cho, Takamatsu-shi, Kagawa, 760-8522, JAPAN
E-mail address: fukaishi@ed.kagawa-u.ac.jp