3.3 Category quantification method
3.3.1 NPCA-based method
3.3.1.2 Algorithm
Here, we explain the algorithm of the NPCA-based method using the K-means type constraint connector matrix. When we set rbx=rby, rcx =rcy, cb =rbx, cc =rcx, Dx = I, and Dy = I, objective function gccca equals objective function gnpc. Therefore, the NPCA-based method is a special case of using theK-means type constrained connector matrix method.
Proposition 3.13.When we setZ(x)= [X1†Q](1)J
x,Kx andZ(y)= [Y1†W](1)J
y,Ky. The update
formula of Bx, By, Cx, and Cy is obtained as follows:
Bx =UbxVbx′ (3.58)
By =UbyVby′ (3.59)
Cx =UcxVcx′ (3.60)
Cy =UcyVcy′ (3.61)
where Ubx and Vbx are left and right singular matrixes of Z2(x)(Cx⊗I)′F2(x)′. Uby and Vby are left and right singular matrixes of
Z2(y)(Cy⊗I)F2(y)′. Ucx and Vcx are left and right singular matrixes of
Z3(x)(Bx⊗I)F3(x)′. Ucy andVcy are left and right singular matrixes of
Z2(y)(By⊗I)F3(y)′.
Proof. First, we explain the update formula of Bx. The term related to Bx is the first term. The first term of objective functiongccca is rewritten as follows:
∥X1Q−F1(x)(Cx⊗Bx)′∥2 =∥Z1(x)−F1(x)(Cx⊗Bx)′∥2
=∥Z2(x)−BxF2(x)(Cx⊗I)′∥2
=tr(Z2(x)′Z2(x))−2tr(Z2(x)′BxF2(x)(Cx⊗I)′) + tr((Cx⊗I)F2(x)Bx′BxF2(x)(Cx⊗I)′)
=tr(Z2(x)′Z2(x))−2tr(BxF2(x)(Cx⊗I)′Z2(x)′) + tr((Cx⊗I)F2(x)′F2(x)(Cx⊗I)′).
Given parameters exceptBx, we obtain the update formula of Bx by maximizing
tr(BxF2(x)(Cx⊗I)′Z2(x)′). From the TenBerge theorem (ten Berge, 1993), tr(BxF2(x)(Cx⊗ I)′Z2(x)′) ≤tr(D)holds. D is the diagonal matrix whose elements are singular values of V DU′ =F2(x)(Cx⊗I)′Z2′. When we set Bx =U V′, the equation holds. Therefore, we obtain the update formula. The update formula ofBy is obtained in the same way asBx. Next, we explain the update formula of Cx, which is obtained in a very similar way to Bx. The term related to Cx is the first term. The first term of objective functiongccca is rewritten as follows:
∥X1Q−F1(x)(Cx⊗Bx)′∥2 =∥Z1(x)−F1(x)(Cx⊗Bx)′∥2
=∥Z3(x)−CxF3(x)(Bx⊗I)′∥2
=tr(Z3(x)′Z3(x))−2tr(Z3(x)′CxF3(x)(Bx⊗I)′) + tr((Bx⊗I)F3(x)Cx′CxF3(x)(Bx⊗I)′)
=tr(Z3(x)′Z3(x))−2tr(BxF3(x)(Cx⊗I)′Z3(x)′) + tr((Bx⊗I)F3(x)′F3(x)(Bx⊗I)′).
Given another parameter exceptCx, we obtain the update formula of Cx by maximizing tr(CxF3(x)(Bx⊗I)′Z3(x)′). From the TenBerge theorem, tr(CxF3(x)(Bx⊗I)′Z3(x)′)≤tr(D) holds. D is a diagonal matrix whose elements are singular values ofV DU′ =F3(x)(Bx⊗ I)′Z3′. When we set Cx = U V′, the equation holds. Therefore, we obtain the update formula. The update formula ofCy is obtained in the same way asCx.
Proposition 3.14. The update formulas of F(x) and F(y) are obtained as follows:
F1(x) =(X1†Q(Cx⊗Bx) +F1(y)DyD′x)(I+DxDx′)−1, (3.62) F1(y) =(Y1†W(Cy⊗By) +F1(x)DxDy′)(I +DyDy′)−1. (3.63) Proof. First, we explain about the update formula ofF1(x).
∥X1†Q−F1(x)(Cx⊗Bx)′∥2+∥F1(x)Dx−F1(y)Dy∥2
=−2tr(F1(x)(Cx⊗Bx)′Q′X1†′)−2tr(F1(x)′F1(y)DyD′x)
+ tr(F1(x)DxDx′F1(x)′) + tr(F1(x)(Cx′Cx⊗Bx′Bx)F1(x)) + const.,
where const. is constant independence fromF1(x). Thus, the partial derivative function of gccca with respect toF1(x) is obtained as follows:
∂
∂F1(x){∥X1†Q−F1(x)(Cx⊗Bx)′∥2+∥F1(x)Dx−F1(y)Dy∥2}
=−2X1†Q(Cx⊗Bx)−2F1(y)DyD′x+ 2F1(x)DxDx′ + 2F1(x).
When we set the partial derivative function of gccca with respect to F1(x) as 0, we obtain the following equation:
−2X1†Q(Cx⊗Bx)−2F1(y)DyDx′ + 2F1(x)DxDx′ + 2F1(x) = 0
⇐⇒F1(x)DxDx′ +F1(x)=X1†Q(Cx⊗Bx) +F1(y)DyDx′
⇐⇒F1(x)= (X1†Q(Cx⊗Bx) +F1(y)DyD′x)(I+DxDx′)−1. The update formula ofF1(y) is obtained in the same way asF1(x).
∥Y1†W −F1(y)(Cy⊗By)′∥2+∥F1(x)Dx−F1(y)Dy∥2
=−2tr(F1(y)(Cy⊗By)′W′Y1†′)−2tr(F1(y)′F1(x)DxD′y)
+ tr(F1(y)DyD′yF1(y)′) + tr(F1(y)(Cy′Cy⊗By′By)F1(y)) + const.,
where const. is constant independence fromF1(y). Thus, the partial derivative function of gccca with respect toF1(y) is obtained as follows:
∂
∂F1(y){∥Y1†W −F1(y)(Cy⊗By)′∥2+∥F1(x)Dx−F1(y)Dy∥2}
=−2Y1†W(Cy⊗By)−2F1(x)DxDy′ + 2F1(y)DyDy′ + 2F1(y).
When we set the partial derivative function ofF1(y)as 0, we obtain the Following equation:
−2Y1†W(Cy⊗By)−2F1(x)DxD′y+ 2F1(y)DyD′y+ 2F1(y)= 0
⇐⇒F1(y)DyDy′ +F1(y)=Y1†W(Cy⊗By) +F1(y)DxD′y
⇐⇒F1(y)= (Y1†W(Cy⊗By) +F1(x)DxD′y)(I+DyD′y)−1.
Proposition 3.15. The update formula ofqkxjx is obtained as follows:
qkxjx =√ I(Xk†
xjx
′Xk†
xjx)−12u(qx)1 , (3.64) where u(qx)1 is the first dimension left singular vector of
(Xk†
xjx
′Xk†
xjx)−12Xk†
xjx
′Jn(∑rcx
ℓ c(x)k
xℓF(ℓ)(x)b(x)j
x ). c(x)k
xℓ is (kx, ℓ) element of Cx. b(x)j
x is the jx-th row vector of Bx. F(ℓ)(x) is the matrix corresponding to dimensionℓ of Cx.
The update formula of wkyjy is obtained as follows:
wkyjy =√ I(Yk†
yjy
′Yk†
yjy)−12u(wy)1 , (3.65) where u(wx)1 is the first dimension left singular vector of
(Yk†
yjy
′Yk†
yjy)−12Yk†
yjy
′Jn(∑rcy
ℓ c(y)k
yℓF(ℓ)(y)b(y)jy ). c(y)k
yℓ is(ky, ℓ) element ofCy. b(y)jy is thejx-th row vector ofBy. F(ℓ)(y) is the matrix corresponding to dimensionℓ of Cy
Proof. First, we explain about the update formula of qkxjx. From definition Q, qkxjx are independent from each other. Thus, the update formula of qkxjx can be calculated individually. The term that is related toqkxjx is the first term of gccca. The first term of gccca is rewritten as follows:
∥X1†Q−F1(x)(Cx⊗Bx)′∥2=−2tr(Q′X†′1F1(x)(Cx⊗Bx)′) + const. (3.66) From equation (3.66), we consider the minimization problem as theQ that maximizes tr(Q′X†′1F1(x)(Cx⊗Bx)′). From the definition ofQ, in order to maximize tr(Q′X†′1F1(x)(Cx⊗ Bx)′), we consider each value of qkxjx. Objective function g∗ for qkxjx is obtained as fol-lows:
g∗(qkxjx |Cx, Bx, Fx, X†) = tr(qk′xjxXk†
xjx
′∑rcx
ℓ
c(x)k
xℓF(ℓ)(x)b(x)j
x ),
From the constraint on qkxjx, this objective function g∗ is very similar to the objective function of canonical correlation analysis. From the constraintXj†
xkxqjxkx =J Xj†
xkxqjxkx, Xj†
xkxqjxkx is the element of complementary space of 1. Therefore, first, the Xj†
xkx is projectedJ space. Then, we search the parameters maximizingg∗. Thus, we change the objective functiong∗ tog1∗ as follows:
g∗1(qkxjx |Cx, Bx, Fx, X†) = tr(qk′xjxXk†
xjx
′J
rcx
∑
ℓ
c(x)k
xℓF(ℓ)(x)b(x)jx ).
When we set q∗kxjx = √1 I(Xk†
xjx
′Xk†
xjx)1/2qkxjx, equation qk∗xjx′q∗kxjx = 1 holds from the constraint case ofgccca. Therefore, we can rewriteg∗1 as follows:
g1∗(qkxjx |Cx, Bx, Fx, X†) =√
Itr(q∗kxjx′(Xk†
xjx
′Xk†
xjx)−12Xk†
xjx
′J
rcx
∑
ℓ
c(x)k
xℓF(ℓ)(x)b(x)j
x ).
g1∗ is the same as the objective function of the weight matrix in the categorical canonical covariance case. Therefore, we obtain the update formula ofqk∗xjx as
qk∗xjx =u(qx)1 , whereu(qx)1 is the first dimension left singular vector of (Xk†
xjx
′Xk†
xjx)−12Xk†
xjx
′J∑rcx
ℓ c(x)k
xℓF(ℓ)(x)b(x)j
x . From the definition of qk∗
xjx, the update for-mula ofqkxjx is obtained as follows:
qkxjx =√ I(Xk†
xjx
′Xk†
xjx)−12u(qx)1 .
The update formula of wkyjy is obtained in the same way asqkxjx.
The update formulas of Dx and Dy are the same as the K-means algorithm.
Proposition 3.16. The update formulas of Dbx, Dby, Dcx, and Dcy are obtained as follows:
d(bx)ℓq =
1
(
ℓ= arg min
ℓ∗
[F2(x)(Dcx⊗I)]ℓ∗−d(by)q ′F2(y)(Dcy⊗I) ) 0 (otherwise)
(q= 1, 2, . . . , cb),
(3.67)
d(by)ℓq =
1
(
ℓ= arg min
ℓ∗
[F2(y)(Dcy⊗I)]ℓ∗−d(bx)q ′F2(x)(Dcx⊗I) ) 0 (otherwise)
(q= 1, 2, . . . , cb),
(3.68)
d(cx)ℓq =
1
(
ℓ= arg min
ℓ∗
[F3(x)(Dbx⊗I)]ℓ∗−d(cy)q ′F3(y)(Dby⊗In) ) 0 (otherwise)
(q= 1, 2, . . . , cc),
(3.69)
d(cy)ℓq =
1
(
ℓ= arg min
ℓ∗
[F3(y)(Dby⊗I)]ℓ∗−d(cx)q ′F3(x)(Dbx⊗In) ) 0 (otherwise)
(q= 1, 2, . . . , cc),
(3.70)
where d(bx)ℓq , d(by)ℓq , d(cx)ℓq and d(cy)ℓq are the (ℓ, q) element of Dbx, Dby, Dcx, and Dcy, respectively. [A]ℓ is the ℓ-th column vector ofA.
Proof. From the definition ofgccca, the third term depends on Dbx, Dby, Dcx, and Dcy. We rewrite the third term ofgccca as follows:
∥F1(x)Dx−F1(y)Dy∥2 =∥F1(x)(Dcx⊗Dbx)−F1(y)(Dcy⊗Dby)∥2
=∥D′bxF2(x)(Dcx⊗I)−Dby′ F2(y)(Dcx⊗I)∥2 (3.71)
=∥D′cxF3(x)(Dbx⊗I)−Dcy′ F3(y)(Dby⊗I)∥2 (3.72) From the constraint of Dbx, Dby, equation (3.71) is equivalent to K-means given other parameters. Thus, the update formulas of Dbx, Dby are obtained in the same way as K-means.
From the constraint ofDcx, Dcy, equation (3.72) is equivalent toK-means given other parameters. Thus, the update formulas of Dcx, Dcy are obtained in the same way as K-means.
Summarizing the update formulas, we obtain the algorithm of categorical canonical covariance analysis for three-mode three-way data as algorithm 6.
Algorithm 6 Algorithm of the NPCA-based method
Set the number of dimensionsrbx, rby, rcx, rcy, cb, cc, and stop conditionε Set initial valuesBx(0), Cx(0), By(0), Cy(0), F(0)x , F(0)y , Dbx(0), D(0)by,D(0)cx,and Dcy(0), t←0
S(0)←gccca(F(0)x , F(0)y , Bx(0), By(0), Cx(0), Cy(0), Q(0), W(0), D(0)x , Dy(0)|X†, Y†) repeat
t←t+ 1
Update Bx and By using Cx(t−1), F(tx−1),Cy(t−1), F(ty−1) Update Cx and Cy using Bx(t), F(tx−1),By(t), F(ty−1)
Update Fx using Bx(t), By(t),Cx(t), Cy(t), F(ty−1),Dx(t−1), Dy(t−1) Update Fy using Bx(t), By(t),Cx(t), Cy(t), F(t)x ,Dx(t−1), D(ty−1) Update Dbx usingB(t)x , B(t)y ,Cx(t), Cy(t), F(t)x ,F(t)y , D(tcx−1), D(ty−1) Update Dcx using Bx(t), By(t),Cx(t), Cy(t), F(t)x ,F(t)y , D(t)bx, D(ty−1) Update Dby using Bx(t), By(t),Cx(t), Cy(t), F(t)x ,F(t)y , Dx(t), Dcy(t−1) Update Dby using Bx(t), By(t),Cx(t), Cy(t), F(t)x ,F(t)y , Dx(t), Dby(t)
S(t)←gccca(F(t)x , F(t)y , Bx(t), By(t), Cx(t), Cy(t), Q(t), W(t), Dx(t), Dy(t)|X†, Y†) until |S(t−1)−S(t)| ≤ε
Chapter 4
Simulation studies
In this chapter, we compare canonical covariance analysis for three-mode three-way data with that for two-mode two-way data using several evaluations. In the K-means based method case, we compare our proposed method with two-mode two-way canonical covariance analysis and no connector matrixes with the mean squared loss between the true and estimated weight matrixes. In the regression-based method case, we compare our proposed method with two-mode two-way methods and three-mode three-way regression using mean squared loss of prediction. In the quantification method case, we compare our proposed method with two-mode two-way canonical covariance analysis, no connector matrixes, and no quantification method using the mean squared loss between the true and estimated weight matrixes.
4.1 Constrained connector method
In this section, we describe a numerical example for the contained connector method.
The purpose of introducing the constraint for parameters is different between theK-means type and the regression type. Therefore, we separate these two types. In theK-means type situation, we evaluate the squared error of parameters. In the regression type situation, we focus on the prediction error.