Discriminant Analysis for Multivariate Non-Gaussian Locally Stationary Processes
Junichi HIRUKAWA
Abstract. The extension of classical discriminant analysis techniques in multivariate analysis to time series data is a problem of practical interest. Discrimination between different classes of multivariate locally stationary processes, which constitute a class of non-stationary processes, can be characterized by differing covariance or time varying spectral structures. For discrimination between multivariate non-Gaussian locally stationary processes, the Kullback-Leibler discrimination information measure has been developed. In this paper, asymptotic error rates and limiting distributions are given for a generalized time varying spectral disparity measure that includes the foregoing criteria as special cases. It is well known that the log-likelihood ratio based on the observed stretch gives the optimal classification. It is shown that the discriminant criterion based on such a generalized disparity measure is Gaussian optimal. A non-Gaussian optimal discriminant criterion is also proposed in view of the LAN theorem.
1. Introduction. In multivariate analysis, many methods of discriminant analysis have been investigated in detail (e.g. Anderson, 1984). The extension of classical discriminant analysis techniques in multivariate analysis to time series data is a problem of practical interest. Shumway (1982) gave an extensive review of various discriminant problems in time series. Zhang and Taniguchi (1994) discussed the parametric discriminant problems for non-Gaussian vector linear processes, and showed that a discriminant criterion based on an integral functional of the periodogram has some good properties, for example, asymptotic normality and non-Gaussian robustness. Zhang and Taniguchi (1995) have shown robustness of the Chernoff information measure to peak contamination in the spectra of the process concerned. For discrimination between non-Gaussian multivariate time series, Kakizawa et al. (1998) have introduced a disparity measure, which includes the Kullback-Leibler discrimination information and the Chernoff information measure, and gave applications to the problems of classifying earthquakes and mining explosions.
Although the analysis for stationary time series is well established, stationary time series models are often not plausible for describing the real world. Dahlhaus (1996a, 1996b, 1996c, 1997) has introduced an important class of non-stationary processes, called locally stationary processes, and established the asymptotic theory of statistical inference. Discrimination between different classes of multivariate time series that can be characterized by differing covariance or time varying structures is important in applications occurring in the analysis of seismic records and biometric data. Sakiyama and Taniguchi (2004) investigated the problem of classifying a multivariate non-Gaussian locally stationary process $\{X_{t,T}\}$ into one of two categories described by two hypotheses $\Pi_i : f_i(u,\lambda)$, $i=1,2$, where $f_i(u,\lambda)$ are time varying spectral density matrices. They used an approximation of the Gaussian Kullback-Leibler information measure as a classification statistic for this problem and showed that this statistic is consistent. The misclassification probabilities were also evaluated under contiguous hypotheses. In this paper, we generalize this measure to non-linear time varying spectral measures which include the Kullback-Leibler information and the Chernoff information measure. We also propose a genuine non-Gaussian optimal discrimination criterion based on another approach.

2000 Mathematics Subject Classification. 62M10; 62M15; (62H30).
Key words and phrases. Locally stationary vector process; Discriminant analysis; Chernoff; Kullback-Leibler; Non-Gaussian robust; Peak robustness; Non-Gaussian optimal.

The time series data recorded in real phenomena, such as seismic records and financial time series, are often non-stationary and non-Gaussian. Investigating the actual performance of our discrimination criterion on such multivariate non-stationary and non-Gaussian time series data will be of increasing importance. However, this problem requires another paper, and we leave it as future work.
This paper is organized as follows. In Section 2 we define the multivariate non-Gaussian locally stationary processes, and introduce a nonparametric time varying spectral density estimator, which is due to Dahlhaus (1996a, 1996b, 1997). Section 3 gives a generalized measure of disparity which includes the Kullback-Leibler and Chernoff information measures. In Section 4, we derive the limiting distributions and asymptotic error rates of our discriminant statistics. We also discuss conditions for non-Gaussian robustness, and show that our discriminant criterion is Gaussian optimal. Peak robustness of our discrimination criterion is studied, and some numerical examples are given. In Section 5, we propose a genuine non-Gaussian optimal discrimination criterion based on the LAN property.
2. Nonparametric spectral estimator of multivariate locally stationary processes. When we deal with nonstationary processes, one of the difficult problems to solve is how to set up an adequate asymptotic theory. To meet this, Dahlhaus (1996a, 1996b, 1997) introduced an important class of nonstationary processes and developed the statistical inference. We give the precise definition of multivariate locally stationary processes, which is due to Dahlhaus (2000).
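Definition 1. A sequence of multivariate stochastic processes $X_{t,T} = (X^{(1)}_{t,T},\ldots,X^{(m)}_{t,T})'$, ($t = 2-N/2,\ldots,1,\ldots,T,\ldots,T+N/2$; $T,N \ge 1$) is called locally stationary with mean vector $0$ and transfer function matrix $A^{\circ}$ if there exists a representation
\[
X_{t,T} = \int_{-\pi}^{\pi} \exp(i\lambda t)\, A^{\circ}_{t,T}(\lambda)\, d\xi(\lambda), \tag{1}
\]
where
(i) $\xi(\lambda) = (\xi_1(\lambda),\ldots,\xi_m(\lambda))'$ is a complex valued stochastic vector process on $[-\pi,\pi]$ with $\overline{\xi_a(\lambda)} = \xi_a(-\lambda)$ and
\[
\mathrm{cum}\{d\xi_{a_1}(\lambda_1),\ldots,d\xi_{a_k}(\lambda_k)\} = \eta\Bigl(\sum_{j=1}^{k}\lambda_j\Bigr)\frac{\kappa_{a_1,\ldots,a_k}}{(2\pi)^{k-1}}\, d\lambda_1\cdots d\lambda_{k-1}, \tag{2}
\]
for $k \ge 2$, $a_1,\ldots,a_k = 1,\ldots,m$, where $\mathrm{cum}\{\ldots\}$ denotes the cumulant of $k$-th order, and $\eta(\lambda) = \sum_{j=-\infty}^{\infty}\delta(\lambda + 2\pi j)$ is the period $2\pi$ extension of the Dirac delta function.
(ii) There exists a constant $K$ and a $2\pi$-periodic matrix valued function $A : [0,1]\times\mathbb{R} \to \mathbb{C}^{m\times m}$ with $A(u,-\lambda) = \overline{A(u,\lambda)}$ and
\[
\sup_{t,\lambda}\left| A^{\circ}_{t,T}(\lambda)_{a,b} - A\Bigl(\frac{t}{T},\lambda\Bigr)_{a,b}\right| \le K T^{-1} \tag{3}
\]
for all $a,b = 1,\ldots,m$ and $T \in \mathbb{N}$. $A(u,\lambda)$ is assumed to be continuous in $u$.
$f(u,\lambda) := A(u,\lambda)\,\Omega\,A(u,\lambda)^{*}$ is called the time varying spectral density matrix of the process, where $\Omega = (\kappa_{a,b})_{a,b=1,\ldots,m}$ and $D^{*}$ denotes the complex conjugate transpose of the matrix $D$.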
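Write
\[
\varepsilon_t := \int_{-\pi}^{\pi} \exp(i\lambda t)\, d\xi(\lambda), \tag{4}
\]
then $E(\varepsilon_t) = 0$, $E(\varepsilon_t\varepsilon_t') = \Omega$ and $E(\varepsilon_t\varepsilon_s')$ for $t \ne s$ is a zero matrix. We make the following assumption.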
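Assumption 1. $X_{t,T}$ has the MA($\infty$) representation
\[
X_{t,T} = \sum_{s=-\infty}^{\infty} a_{t,T}(s)\,\varepsilon_{t-s}, \tag{5}
\]
that is,
\[
A^{\circ}_{t,T}(\lambda) = \sum_{s=-\infty}^{\infty} a_{t,T}(s)\exp(-i\lambda s), \tag{6}
\]
where the coefficients fulfill
\[
\sup_{t}\sum_{s=-\infty}^{\infty}\left| a_{t,T}(s)_{c,d} - a_s\Bigl(\frac{t}{T}\Bigr)_{c,d}\right| = O(T^{-1}) \tag{7}
\]
for all $c,d = 1,\ldots,m$, with continuous matrix function $a_s(u)$. Then, the condition (3) is fulfilled for
\[
A(u,\lambda) = \sum_{s=-\infty}^{\infty} a_s(u)\exp(-i\lambda s). \tag{8}
\]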
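To make (8) and the definition of the time varying spectral density matrix concrete, the following Python sketch evaluates the transfer function and the product of the transfer function, the innovation covariance matrix and the conjugate transpose of the transfer function, for a bivariate time varying MA(1) model. The model, the coefficient functions, the innovation covariance matrix and all function names are hypothetical choices made only for illustration; none of them is taken from the paper.

```python
import numpy as np

def transfer_function(u, lam, coeffs):
    """A(u, lam) = sum_s a_s(u) exp(-i lam s), cf. (8), over finitely many lags.

    coeffs maps each lag s to a function u -> (m x m) real matrix a_s(u).
    """
    return sum(a_s(u) * np.exp(-1j * lam * s) for s, a_s in coeffs.items())

def tv_spectral_density(u, lam, coeffs, innov_cov):
    """f(u, lam) = A(u, lam) innov_cov A(u, lam)^* (conjugate transpose)."""
    A = transfer_function(u, lam, coeffs)
    return A @ innov_cov @ A.conj().T

# Hypothetical bivariate time varying MA(1) model (illustration only).
coeffs = {
    0: lambda u: np.eye(2),
    1: lambda u: np.array([[0.5 * np.cos(np.pi * u), 0.2],
                           [0.0, 0.3 * u]]),
}
innov_cov = np.array([[1.0, 0.3], [0.3, 1.0]])
f_example = tv_spectral_density(0.25, 1.0, coeffs, innov_cov)
```

Because the coefficient matrices are real, the sketch reproduces the symmetry between the transfer function at $-\lambda$ and its complex conjugate at $\lambda$, and the resulting matrix is Hermitian.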
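Furthermore we make the following assumption on the transfer function matrix $A(u,\lambda)$.
Assumption 2. (i) The transfer function matrix $A(u,\lambda)$ is $2\pi$-periodic in $\lambda$ and the periodic extension is twice differentiable in $u$ and $\lambda$ with uniformly bounded continuous derivatives $\frac{\partial^2}{\partial u^2}A$, $\frac{\partial^2}{\partial\lambda^2}A$ and $\frac{\partial^2}{\partial u^2}\frac{\partial}{\partial\lambda}A$.
(ii) All the eigenvalues of $f(u,\lambda)$ are bounded from below and above by some constants $\delta_1,\delta_2 > 0$ uniformly in $u$ and $\lambda$.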
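As an estimator of $f(u,\lambda)$, we use the nonparametric estimator of kernel type defined by
\[
\hat f_T(u,\lambda) = \int_{-\pi}^{\pi} W_T(\lambda-\mu)\, I_N(u,\mu)\, d\mu, \tag{9}
\]
where $W_T(\omega) = M\sum_{\nu=-\infty}^{\infty} W(M(\omega + 2\pi\nu))$ is the weight function and $M > 0$ depends on $T$, and $I_N(u,\lambda)$ is the data tapered periodogram matrix over the segment $\{[uT]-N/2+1,\ldots,[uT]+N/2\}$ defined as
\[
I_N(u,\lambda) = \frac{1}{2\pi H_{2,N}}\left\{\sum_{s=1}^{N} h\Bigl(\frac{s}{N}\Bigr) X_{[uT]-N/2+s,T}\exp\{i\lambda s\}\right\}\left\{\sum_{r=1}^{N} h\Bigl(\frac{r}{N}\Bigr) X_{[uT]-N/2+r,T}\exp\{i\lambda r\}\right\}^{*}. \tag{10}
\]
Here $h : [0,1] \to \mathbb{R}$ is a data taper and $H_{2,N} = \sum_{s=1}^{N} h\bigl(\frac{s}{N}\bigr)^2$. It should be noted that $I_N(u,\lambda)$ is not a consistent estimator of the spectral density. To make a consistent estimator of $f(u,\lambda)$ we have to smooth it over neighbouring frequencies.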
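The data tapered periodogram matrix (10) can be sketched in a few lines of Python. The function names, the cosine taper and the simulated white-noise input below are assumptions made for illustration only; the sketch also assumes that the whole segment lies inside the observed range, so the boundary observations are not handled.

```python
import numpy as np

def cosine_taper(x):
    """Hypothetical choice of taper: h(x) = (1 - cos(2 pi x)) / 2 on [0, 1]."""
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * x))

def tapered_periodogram(X, u, lam, N, taper):
    """Data tapered periodogram matrix I_N(u, lam) of (10).

    X is a (T, m) array whose row t-1 holds X_{t,T}; the segment
    {[uT]-N/2+1, ..., [uT]+N/2} is assumed to lie inside 1..T.
    """
    T = X.shape[0]
    centre = int(np.floor(u * T))
    s = np.arange(1, N + 1)
    times = centre - N // 2 + s              # 1-based time indices
    if times[0] < 1 or times[-1] > T:
        raise ValueError("segment leaves the observed range")
    h = taper(s / N)
    H2N = np.sum(h ** 2)
    # d = sum_s h(s/N) X_{[uT]-N/2+s,T} exp(i lam s), an m-vector
    d = (h * np.exp(1j * lam * s)) @ X[times - 1]
    return np.outer(d, d.conj()) / (2.0 * np.pi * H2N)

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 2))            # simulated input, illustration only
I_mid = tapered_periodogram(X, u=0.5, lam=0.7, N=64, taper=cosine_taper)
```

The result is a Hermitian matrix of rank one, which illustrates why the raw periodogram has to be smoothed over neighbouring frequencies before it can serve as an estimator of a full-rank spectral density matrix.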
Now we impose the following assumptions on $W(\cdot)$ and $h(\cdot)$.
Assumption 3. The weight function $W : \mathbb{R} \to [0,\infty)$ satisfies $W(x) = 0$ for $x \notin [-1/2,1/2]$, and is a continuous and even function satisfying $\int_{-1/2}^{1/2} W(x)\,dx = 1$ and $\int_{-1/2}^{1/2} x^2 W(x)\,dx < \infty$.
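Assumption 4. The data taper $h : \mathbb{R} \to \mathbb{R}$ satisfies (i) $h(x) = 0$ for all $x \notin [0,1]$ and $h(x) = h(1-x)$, (ii) $h(x)$ is continuous on $\mathbb{R}$, twice differentiable at all $x \notin U$ where $U$ is a finite set of $\mathbb{R}$, and $\sup_{x\notin U}|h''(x)| < \infty$. Write
\[
K_t(x) := \left\{\int_0^1 h(x)^2\,dx\right\}^{-1} h(x+1/2)^2, \qquad x \in [-1/2,1/2], \tag{11}
\]
which plays the role of a kernel in the time domain.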
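As a quick numerical check of (11), the following Python sketch evaluates the time domain kernel for a cosine taper and verifies that it integrates to one over $[-1/2,1/2]$. The cosine taper, the grid sizes and the helper names are assumptions chosen for illustration; any taper satisfying Assumption 4 could be substituted.

```python
import numpy as np

def cosine_taper(x):
    """h(x) = (1 - cos(2 pi x)) / 2 on [0, 1], zero elsewhere (hypothetical choice)."""
    x = np.asarray(x, dtype=float)
    inside = (x >= 0.0) & (x <= 1.0)
    return np.where(inside, 0.5 * (1.0 - np.cos(2.0 * np.pi * x)), 0.0)

def integrate(y, x):
    """Plain trapezoidal rule on the grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def time_kernel(x, taper, grid_size=100_001):
    """K_t(x) = {int_0^1 h(v)^2 dv}^{-1} h(x + 1/2)^2, cf. (11)."""
    grid = np.linspace(0.0, 1.0, grid_size)
    norm = integrate(taper(grid) ** 2, grid)
    return taper(np.asarray(x, dtype=float) + 0.5) ** 2 / norm

x = np.linspace(-0.5, 0.5, 20_001)
k = time_kernel(x, cosine_taper)
total_mass = integrate(k, x)     # close to 1 by construction of (11)
```

For this taper the normalizing integral equals 3/8, so the kernel attains the value 8/3 at the origin, and the symmetry $h(x)=h(1-x)$ makes the kernel an even function.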
Furthermore, we assume
Assumption 5. $M = M(T)$ and $N = N(T)$, $M \ll N \ll T$ satisfy
\[
\frac{\sqrt{T}}{M^2} = o(1), \qquad \frac{N^2}{T^{3/2}} = o(1), \qquad \frac{\sqrt{T}\log N}{N} = o(1). \tag{12}
\]
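To see what (12) demands in practice, suppose, purely as an illustrative assumption, that both tuning parameters grow like powers of the sample size, $M = T^{a}$ and $N = T^{b}$. Then the three conditions in (12) together with $M \ll N \ll T$ reduce to $1/4 < a < b$ and $1/2 < b < 3/4$. The Python sketch below encodes this reduction and evaluates the three ratios for a given $T$; the function names are hypothetical.

```python
import math

def rates_admissible(a, b):
    """For M = T**a and N = T**b: M << N << T and (12) hold iff this is True."""
    return 0.25 < a < b and 0.5 < b < 0.75

def rate_terms(T, a, b):
    """The three ratios in (12), each of which should tend to zero."""
    M, N = T ** a, T ** b
    return (math.sqrt(T) / M ** 2,
            N ** 2 / T ** 1.5,
            math.sqrt(T) * math.log(N) / N)

small = rate_terms(1e4, 0.3, 0.6)
large = rate_terms(1e8, 0.3, 0.6)
```

With $a = 0.3$ and $b = 0.6$ all three ratios decrease as $T$ grows, although the third one does so slowly because of the logarithmic factor.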
The following lemmas are multivariate versions of Theorem 2.2 of Dahlhaus (1996c) and Theorem A.2 of Dahlhaus (1997) (see also Sakiyama and Taniguchi (2003)).
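Lemma 1. Assume that Assumptions 1-5 hold. Then
(i)
\[
E(I_N(u,\lambda)) = f(u,\lambda) + \frac{N^2}{2T^2}\int_{-1/2}^{1/2} x^2 K_t(x)^2\,dx\,\frac{\partial^2}{\partial u^2}f(u,\lambda) + o\Bigl(\frac{N^2}{T^2}\Bigr) + O\Bigl(\frac{\log N}{N}\Bigr), \tag{13}
\]
(ii)
\[
E(\hat f(u,\lambda)) = f(u,\lambda) + \frac{N^2}{2T^2}\int_{-1/2}^{1/2} x^2 K_t(x)^2\,dx\,\frac{\partial^2}{\partial u^2}f(u,\lambda) + \frac{1}{2M^2}\int_{-1/2}^{1/2} x^2 W(x)^2\,dx\,\frac{\partial^2}{\partial\lambda^2}f(u,\lambda) + o\Bigl(\frac{N^2}{T^2} + M^{-2}\Bigr) + O\Bigl(\frac{\log N}{N}\Bigr), \tag{14}
\]
(iii)
\[
\sum_{i,j=1}^{m}\mathrm{Var}\bigl(\hat f_{i,j}(u,\lambda)\bigr) = \frac{M}{N}\sum_{i,j=1}^{m} f_{i,j}(u,\lambda)^2\int_{-1/2}^{1/2} K_t(x)^2\,dx\int_{-1/2}^{1/2} W(x)^2\,dx\,\bigl(2\pi + 2\pi\{\lambda \equiv 0 \bmod \pi\}\bigr) + o\Bigl(\frac{M}{N}\Bigr). \tag{15}
\]
Hence, we have
\[
E\bigl\|\hat f(u,\lambda) - f(u,\lambda)\bigr\|^2 = O\Bigl(\frac{M}{N}\Bigr) + O\Bigl(M^{-2} + \frac{N^2}{T^2}\Bigr)^2 = O\Bigl(\frac{M}{N}\Bigr), \tag{16}
\]
where $\|A\|$ is the Euclidean norm of the matrix $A$; $\|A\| = \{\mathrm{tr}\{AA^{*}\}\}^{1/2}$.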
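Lemma 2. Assume that Assumptions 1-5 hold. Let $\phi_j(u,\lambda)$, $j = 1,\ldots,k$ be $m\times m$ matrix-valued continuous functions on $[0,1]\times[-\pi,\pi]$ which satisfy the same conditions as the transfer function matrix $A(u,\lambda)$ in Assumption 2 and $\phi_j(u,\lambda)^{*} = \phi_j(u,\lambda)$, $\phi_j(u,-\lambda) = \overline{\phi_j(u,\lambda)}$. Then
\[
L_T(\phi_j) = \sqrt{T}\left\{\frac{1}{T}\sum_{t=1}^{T}\int_{-\pi}^{\pi}\mathrm{tr}\Bigl\{\phi_j\Bigl(\frac{t}{T},\lambda\Bigr) I_N\Bigl(\frac{t}{T},\lambda\Bigr)\Bigr\}d\lambda - \int_0^1\!\!\int_{-\pi}^{\pi}\mathrm{tr}\{\phi_j(u,\lambda)f(u,\lambda)\}\,d\lambda\,du\right\}, \qquad j = 1,\ldots,k \tag{17}
\]
have, asymptotically, a normal distribution with zero mean vector and covariance matrix $V$ whose $(i,j)$-th element is
\[
4\pi\int_0^1\Biggl[\int_{-\pi}^{\pi}\mathrm{tr}\{\phi_i(u,\lambda)f(u,\lambda)\phi_j(u,\lambda)f(u,\lambda)\}\,d\lambda
+ \frac{1}{4\pi^2}\sum_{a_1,a_2,a_3,a_4}\ \sum_{b_1,b_2,b_3,b_4}\kappa_{b_1,b_2,b_3,b_4}\int_{-\pi}^{\pi}\!\!\int_{-\pi}^{\pi}\phi_i(u,\lambda)_{a_1,a_2}\,\phi_j(u,\mu)_{a_4,a_3}\,A(u,\lambda)_{a_2,b_1}A(u,-\lambda)_{a_1,b_2}A(u,-\mu)_{a_4,b_3}A(u,\mu)_{a_3,b_4}\,d\lambda\,d\mu\Biggr]du. \tag{18}
\]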
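Assumption 5 does not coincide with Assumption A.1 (ii) of Dahlhaus (1997). As mentioned in A.3 Remarks of Dahlhaus (1997), Assumption A.1 (ii) of Dahlhaus (1997) is required because of the $\sqrt{T}$-unbiasedness at the boundary 0 and 1. If we assume that $\{X_{2-N/2,T},\ldots,X_{0,T}\}$ and $\{X_{T+1,T},\ldots,X_{T+N/2,T}\}$ are available with Assumption 5, then from Lemma 1 (i)
\[
E(L_T(\phi_j)) = \sqrt{T}\,E\left\{\frac{1}{T}\sum_{t=1}^{T}\int_{-\pi}^{\pi}\mathrm{tr}\Bigl\{\phi_j\Bigl(\frac{t}{T},\lambda\Bigr) I_N\Bigl(\frac{t}{T},\lambda\Bigr)\Bigr\}d\lambda - \int_0^1\!\!\int_{-\pi}^{\pi}\mathrm{tr}\{\phi_j(u,\lambda)f(u,\lambda)\}\,d\lambda\,du\right\}
= O\Bigl(\sqrt{T}\Bigl(\frac{N^2}{T^2} + \frac{\log N}{N} + \frac{1}{T}\Bigr)\Bigr) = o(1). \tag{19}
\]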
3. Measures of disparity. We suppose that we have a collection of zero-mean $m$-dimensional vector locally stationary time series $X_{t,T} = (X^{(1)}_{t,T},X^{(2)}_{t,T},\ldots,X^{(m)}_{t,T})'$, $t = 1,2,\ldots,T$. Denote by $p_i(x)$, $i = 1,2$, the probability density functions of the $mT\times 1$ vector $x = (X'_{1,T},X'_{2,T},\ldots,X'_{T,T})'$ under two hypotheses $\Pi_i$, $i = 1,2$, respectively. In the case of locally stationary processes, $\Pi_i$, $i = 1,2$ are, respectively, described by the time varying spectral density matrices $f_i(u,\lambda)$, $i = 1,2$, corresponding to the $mT\times mT$ covariance matrices $\Sigma_T(p_i)$, $i = 1,2$. Although the theory developed later transcends the usual normal theory, it is convenient to use the normal assumption temporarily to motivate measures of disparity between the densities $p_i(\cdot)$, $i = 1,2$.
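One classical measure of disparity between two multivariate densities is the Kullback-Leibler (KL) discrimination information, defined by
\[
K(p_j;p_k) = E_{p_j}\left\{\log\frac{p_j(x)}{p_k(x)}\right\}, \tag{20}
\]
where $E_p$ denotes the expectation under the density $p(\cdot)$. The KL discrimination information takes the form
\[
K(p_j;p_k) = \frac{1}{2}\left[\mathrm{tr}\{\Sigma_T(p_j)\Sigma_T(p_k)^{-1}\} - \log\frac{|\Sigma_T(p_j)|}{|\Sigma_T(p_k)|} - mT\right] \tag{21}
\]
when $p_i(x)$ correspond to two hypothetical zero-mean multivariate normal distributions. The $mT\times mT$ covariance matrices $\Sigma_T(p_i)$ contain the $m\times m$ matrices $\Sigma^{T}_{s,t}(p_i)$, $s,t = 1,\ldots,T$ as blocks, where
\[
\Sigma^{T}_{s,t}(p_i) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\exp(i\lambda(s-t))\,A^{\circ}_{s,T}(\lambda)\,\Omega\,A^{\circ}_{t,T}(-\lambda)'\,d\lambda. \tag{22}
\]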
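The Gaussian form (21) is straightforward to evaluate numerically once the two covariance matrices are available. The following Python sketch is a minimal illustration under the assumption that both inputs are symmetric positive definite; the function name and the test matrices are hypothetical.

```python
import numpy as np

def gaussian_kl(sigma_j, sigma_k):
    """KL discrimination information (21) between N(0, sigma_j) and N(0, sigma_k)."""
    n = sigma_j.shape[0]
    # sigma_k^{-1} sigma_j has the same trace as sigma_j sigma_k^{-1}
    ratio = np.linalg.solve(sigma_k, sigma_j)
    _, logdet_j = np.linalg.slogdet(sigma_j)
    _, logdet_k = np.linalg.slogdet(sigma_k)
    return 0.5 * (np.trace(ratio) - (logdet_j - logdet_k) - n)

identity = np.eye(2)
kl_same = gaussian_kl(identity, identity)          # equal covariances
kl_scaled = gaussian_kl(2.0 * identity, identity)  # doubled variance
```

For the doubled-variance example in dimension two, (21) reduces to $\frac{1}{2}(4 - 2\log 2 - 2) = 1 - \log 2$, which the sketch reproduces.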
Parzen (1990) proposed to use the Chernoff (CH) information measure
\[
B_\alpha(p_j;p_k) = -\log E_{p_j}\left\{\left(\frac{p_j(x)}{p_k(x)}\right)^{-\alpha}\right\}, \tag{23}
\]
as a measure of disparity between the two densities, where the measure is indexed by $\alpha$, $0 < \alpha < 1$. For $\alpha = \frac{1}{2}$, the Chernoff information measure is the symmetric divergence measure. For two normal random vectors differing only in the covariance structure, the measure (23) takes the form
\[
B_\alpha(p_j;p_k) = \frac{1}{2}\left[\log\frac{|\alpha\Sigma_T(p_j) + (1-\alpha)\Sigma_T(p_k)|}{|\Sigma_T(p_k)|} - \alpha\log\frac{|\Sigma_T(p_j)|}{|\Sigma_T(p_k)|}\right]. \tag{24}
\]
It is of interest to note the antisymmetry property $B_\alpha(p_j;p_k) = B_{1-\alpha}(p_k;p_j)$ and that $B_\alpha(p_j;p_k)$, scaled by $\alpha(1-\alpha)$, converges to $K(p_j;p_k)$ for $\alpha \to 0$ and to $K(p_k;p_j)$ for $\alpha \to 1$. Hence the Chernoff measure behaves like the two Kullback-Leibler measures for values of the parameter $\alpha$ that are near the boundaries 0 and 1.
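The antisymmetry property and the boundary behaviour of the Gaussian form (24) can be checked numerically. The Python sketch below is only an illustration; the function name and the two covariance matrices are hypothetical, and both matrices are assumed to be symmetric positive definite.

```python
import numpy as np

def gaussian_chernoff(sigma_j, sigma_k, alpha):
    """Chernoff information (24) between N(0, sigma_j) and N(0, sigma_k)."""
    _, ld_mix = np.linalg.slogdet(alpha * sigma_j + (1.0 - alpha) * sigma_k)
    _, ld_j = np.linalg.slogdet(sigma_j)
    _, ld_k = np.linalg.slogdet(sigma_k)
    return 0.5 * ((ld_mix - ld_k) - alpha * (ld_j - ld_k))

# Hypothetical covariance matrices, illustration only.
sigma_a = np.array([[2.0, 0.5], [0.5, 1.0]])
sigma_b = np.array([[1.0, -0.2], [-0.2, 3.0]])

b_forward = gaussian_chernoff(sigma_a, sigma_b, 0.3)
b_swapped = gaussian_chernoff(sigma_b, sigma_a, 0.7)   # same value by antisymmetry
```

Dividing `gaussian_chernoff(sigma_a, sigma_b, alpha)` by `alpha * (1 - alpha)` for a very small `alpha` gives a value close to the Gaussian KL information (21) with the arguments in the same order, in line with the limit stated in the text.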
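It should be recognized that the information measures (21) and (24) both involve $mT\times mT$ matrices whose dimensions tend to infinity as $T \to \infty$. As in the case of stationary processes, it is natural to use spectral approximations in terms of the time varying spectral density matrices $f_i(u,\lambda)$, $i = 1,2$. The appropriate versions of (21) and (24) are
\[
K(f_j;f_k) = \lim_{T\to\infty} T^{-1}K(p_j;p_k)
= \frac{1}{2}\int_0^1\!\!\int_{-\pi}^{\pi}\left[\mathrm{tr}\{f_j(u,\lambda)f_k^{-1}(u,\lambda)\} - \log\frac{|f_j(u,\lambda)|}{|f_k(u,\lambda)|} - m\right]\frac{d\lambda}{2\pi}\,du \tag{25}
\]
and
\[
B_\alpha(f_j;f_k) = \lim_{T\to\infty} T^{-1}B_\alpha(p_j;p_k)
= \frac{1}{2}\int_0^1\!\!\int_{-\pi}^{\pi}\left[\log\frac{|\alpha f_j(u,\lambda) + (1-\alpha)f_k(u,\lambda)|}{|f_k(u,\lambda)|} - \alpha\log\frac{|f_j(u,\lambda)|}{|f_k(u,\lambda)|}\right]\frac{d\lambda}{2\pi}\,du. \tag{26}
\]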
Note here that the time varying spectral matrices $f_i(u,\lambda)$ correspond to the multivariate densities $p_i(x)$. The advantage of (25) and (26) is that the evaluation problem is reduced to inverting $m\times m$ matrices. Both forms (25) and (26) are functions of the matrix product $f_j(u,\lambda)f_k^{-1}(u,\lambda)$ and can be generalized to the following disparity measure
\[
D_H(f_j;f_k) = \frac{1}{2}\int_0^1\!\!\int_{-\pi}^{\pi} H\bigl(f_j(u,\lambda)f_k^{-1}(u,\lambda)\bigr)\frac{d\lambda}{2\pi}\,du \tag{27}
\]
where $H(\cdot)$ is some real-valued function of a matrix argument. To ensure that $D_H(f_j;f_k)$ has the quasi-distance property, we require $D_H(f_j;f_k) \ge 0$, and that the equality holds if and only if $f_j = f_k$ almost everywhere. The function $H(Z)$ must have a unique minimum at $Z = E_m$, the identity matrix. There are many possible choices of $H(Z)$ such that $D_H(\cdot;\cdot)$ satisfies the quasi-distance property, but we consider only the two cases corresponding to (25) and (26):
\[
H_K(Z) = \mathrm{tr}\{Z\} - \log|Z| - m \tag{28}
\]
and
\[
H_{B_\alpha}(Z) = \log|\alpha Z + (1-\alpha)E_m| - \alpha\log|Z|. \tag{29}
\]
Note that another possible choice is the quadratic function
\[
H_Q(Z) = \frac{1}{2}\mathrm{tr}(Z - E_m)^2. \tag{30}
\]
Generally, $D_H(\cdot;\cdot)$ is not symmetric but can easily be made so by defining
\[
\tilde H(Z) = H(Z) + H(Z^{-1}). \tag{31}
\]
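The three choices (28)-(30) and the symmetrization (31) can be written down directly. The Python sketch below uses hypothetical function names and assumes that the argument is a square matrix with positive determinant, as is the case for the product of one positive definite spectral density matrix with the inverse of another.

```python
import numpy as np

def h_kl(Z):
    """H_K(Z) = tr{Z} - log|Z| - m, cf. (28)."""
    m = Z.shape[0]
    _, logabsdet = np.linalg.slogdet(Z)
    return np.trace(Z).real - logabsdet - m

def h_chernoff(Z, alpha):
    """H_B(Z) = log|alpha Z + (1 - alpha) E_m| - alpha log|Z|, cf. (29)."""
    m = Z.shape[0]
    _, ld_mix = np.linalg.slogdet(alpha * Z + (1.0 - alpha) * np.eye(m))
    _, ld = np.linalg.slogdet(Z)
    return ld_mix - alpha * ld

def h_quadratic(Z):
    """H_Q(Z) = tr{(Z - E_m)^2} / 2, cf. (30)."""
    D = Z - np.eye(Z.shape[0])
    return 0.5 * np.trace(D @ D).real

def symmetrize(H):
    """Return the symmetrized function Z -> H(Z) + H(Z^{-1}) of (31)."""
    return lambda Z: H(Z) + H(np.linalg.inv(Z))

Z_example = np.diag([2.0, 0.5])      # hypothetical argument, illustration only
values = (h_kl(Z_example), h_chernoff(Z_example, 0.5), h_quadratic(Z_example))
```

All three functions vanish at the identity matrix and are positive at `Z_example`, which is the behaviour required for the quasi-distance property.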
The general form (27) can be approximated by sums over frequencies of the form $\lambda_s = 2\pi s/T$ and time $u_t = t/T$, $s,t = 1,2,\ldots,T$, i.e.,
\[
D_H(f_j;f_k) \approx \frac{1}{2T^2}\sum_{s,t=1}^{T} H\bigl(f_j(u_t,\lambda_s)f_k^{-1}(u_t,\lambda_s)\bigr). \tag{32}
\]
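The double sum (32) translates directly into a short loop. In the Python sketch below, the spectral density matrices are passed as functions of rescaled time and frequency; the function names and the toy spectral densities are assumptions made for illustration, and the Kullback-Leibler choice (28) is used for $H$.

```python
import numpy as np

def h_kl(Z):
    """H_K(Z) = tr{Z} - log|Z| - m, cf. (28)."""
    m = Z.shape[0]
    _, logabsdet = np.linalg.slogdet(Z)
    return np.trace(Z).real - logabsdet - m

def disparity(H, f_j, f_k, T):
    """Riemann-sum approximation (32) of D_H(f_j; f_k).

    f_j and f_k map (u, lam) to an (m x m) spectral density matrix.
    """
    total = 0.0
    for t in range(1, T + 1):
        u = t / T
        for s in range(1, T + 1):
            lam = 2.0 * np.pi * s / T
            Z = f_j(u, lam) @ np.linalg.inv(f_k(u, lam))
            total += H(Z)
    return total / (2.0 * T ** 2)

# Toy spectral densities (hypothetical): f_j is twice f_k everywhere.
f_k = lambda u, lam: (1.0 + u) * np.eye(2)
f_j = lambda u, lam: 2.0 * f_k(u, lam)
d_same = disparity(h_kl, f_k, f_k, T=8)
d_scaled = disparity(h_kl, f_j, f_k, T=8)
```

In this toy example the matrix product equals twice the identity at every grid point, so the sum collapses to $\frac{1}{2}(4 - 2\log 2 - 2) = 1 - \log 2$ regardless of $T$.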
4. Discriminant analysis. Suppose that we wish to investigate the problem of classifying a realization $X_T = (X'_{2-N/2,T},\ldots,X'_{1,T},\ldots,X'_{T,T},\ldots,X'_{T+N/2,T})'$ into one of two known categories $\Pi_j$, $j = 1,2$, where $\Pi_j$ is described by the time varying spectral density matrix $f_j(u,\lambda)$. Let $\hat f_T(u,\lambda)$ be the nonparametric time varying spectral density estimator given by (9), which is based on the observation to be classified. We measure the disparity between the sample spectrum of $X_T$ and category $\Pi_j$ by $D_H(\hat f_T;f_j)$. Then the proposed rule is to classify $X_T$ into $\Pi_1$ or $\Pi_2$ according to $D_H > 0$ or $D_H \le 0$, where $D_H$ is the discriminant function defined by
\[
D_H = D_H(\hat f_T;f_2) - D_H(\hat f_T;f_1). \tag{33}
\]
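The classification rule based on (33) can be sketched as follows in Python. The sketch assumes that the estimated and the two hypothesized spectral density matrices have already been evaluated on a common grid of rescaled times and frequencies and stored as arrays of shape `(n_u, n_lam, m, m)`; this storage layout, the function names and the toy inputs are assumptions made for illustration, and the Kullback-Leibler choice (28) is used as the default $H$.

```python
import numpy as np

def h_kl(Z):
    """H_K(Z) = tr{Z} - log|Z| - m applied to a stack of (m x m) matrices."""
    m = Z.shape[-1]
    _, logabsdet = np.linalg.slogdet(Z)
    return np.trace(Z, axis1=-2, axis2=-1).real - logabsdet - m

def disparity(f, g, H=h_kl):
    """Grid version of D_H(f; g); f and g have shape (n_u, n_lam, m, m)."""
    Z = f @ np.linalg.inv(g)
    return 0.5 * np.mean(H(Z))

def classify(f_hat, f_1, f_2, H=h_kl):
    """Return (category, D_H) with D_H = D_H(f_hat; f_2) - D_H(f_hat; f_1)."""
    d_h = disparity(f_hat, f_2, H) - disparity(f_hat, f_1, H)
    return (1 if d_h > 0 else 2), d_h

# Toy inputs (hypothetical): constant spectra on a 4 x 8 grid.
grid = np.broadcast_to(np.eye(2), (4, 8, 2, 2))
f_1, f_2 = 1.0 * grid, 3.0 * grid
label_a, _ = classify(1.1 * grid, f_1, f_2)   # estimate close to f_1
label_b, _ = classify(2.9 * grid, f_1, f_2)   # estimate close to f_2
```

A positive value of the discriminant function means that the estimate is closer, in the sense of the disparity measure, to the first category than to the second, which is why the first category is chosen in that case.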
In this section we examine the asymptotic properties of the discriminant function (33). Assume that the category $\Pi_j$ is an $m$-variate linear process of the form $X_{t,T} = \sum_{k=-\infty}^{\infty} a^{(j)}(k)\,\varepsilon_{t-k}$,