Applying Information Loss Measures to Non-Numerical Data
IPSJ SIG Technical Report, Vol.2017-MPS-113 No.10 / Vol.2017-BIO-50 No.10, 2017/6/23. ⓒ 2017 Information Processing Society of Japan.

Cited works: [3], [4], [5], [6], [7], [8], [9], [10].

2.

2.1

Let $X$ be a set with a distance $d(x, y)$ defined for $x, y \in X$, and let $A = (x_1, \ldots, x_N)$ with $x_i \in X$ $(i = 1, \ldots, N)$. For $A$, define

\[
I(A) = \sum_{i=1}^{N} \sum_{j=1}^{N} d(x_i, x_j)^2 .
\]

The distance depends on $X$:

• $X = \mathbb{R}^n$, $x, y \in \mathbb{R}^n$:
\[
d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
\]

• $x, y \in X$:
\[
d(x, y) = \begin{cases} 1 & (x \neq y) \\ 0 & (x = y) \end{cases}
\]

2.2 ILD

Let $A = (x_{11}, \ldots, x_{1 n_1}, x_{21}, \ldots, x_{2 n_2}, \ldots, x_{m1}, \ldots, x_{m n_m})$ be divided into groups $A_1, \ldots, A_m$ with $A_k = (x_{k1}, \ldots, x_{k n_k})$ $(k = 1, \ldots, m)$, and let $\hat{x}_k$ be the representative of $A_k$. Replacing every element by the representative of its group gives

\[
\hat{A} = (\hat{x}_1, \ldots, \hat{x}_1, \hat{x}_2, \ldots, \hat{x}_2, \ldots, \hat{x}_m, \ldots, \hat{x}_m),
\]

and ILD is

\[
\mathrm{ILD} = \frac{I(A) - I(\hat{A})}{I(A)} .
\]

Example 2.1. $D = (1, 2, 3, 4)$ is divided into $D_1 = (1, 2)$ and $D_2 = (3, 4)$, giving $\hat{D} = (1.5, 1.5, 3.5, 3.5)$. Then $I(D) = 40$, $I(\hat{D}) = 32$, and

\[
\mathrm{ILD} = \frac{40 - 32}{40} = 0.2 .
\]

Example 2.2. $S = (a_{11}, a_{12}, a_{21}, a_{22})$ is divided into $S_1 = (a_{11}, a_{12})$ and $S_2 = (a_{21}, a_{22})$.

1. $\hat{S} = (a_1, a_1, a_2, a_2)$: $I(S) = 144$, $I(\hat{S}) = 32$, and $\mathrm{ILD} = \frac{144 - 32}{144} = 0.778$.
2. $\hat{\hat{S}} = (a, a, a, a)$: $I(\hat{\hat{S}}) = 0$, and $\mathrm{ILD} = \frac{144 - 0}{144} = 1$.
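The arithmetic of Examples 2.1 and 2.2 can be checked with a short sketch. The helper names (`info`, `ild`, `abs_dist`, `tree_dist`, `PARENT`) are hypothetical, not from the report. For Example 2.2 the distance is an assumption: the path length in a two-level hierarchy (`a11`, `a12` under `a1`; `a21`, `a22` under `a2`; `a1`, `a2` under `a`) happens to reproduce the reported values 144, 32 and 0, but the extracted text does not confirm that this is the distance the report uses.

```python
from itertools import product

def info(data, dist):
    """I(A): sum of squared distances over all ordered pairs of A."""
    return sum(dist(x, y) ** 2 for x, y in product(data, repeat=2))

def ild(original, replaced, dist):
    """ILD = (I(A) - I(A_hat)) / I(A)."""
    i_orig = info(original, dist)
    return (i_orig - info(replaced, dist)) / i_orig

def abs_dist(x, y):
    """Euclidean distance on the real line."""
    return abs(x - y)

# ASSUMPTION: hypothetical two-level hierarchy for Example 2.2.
PARENT = {"a11": "a1", "a12": "a1", "a21": "a2", "a22": "a2",
          "a1": "a", "a2": "a"}

def ancestors(node):
    """Path from node up to the root, starting with node itself."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def tree_dist(x, y):
    """Number of edges on the path between x and y in the hierarchy."""
    px, py = ancestors(x), ancestors(y)
    common = next(n for n in px if n in py)  # lowest common ancestor
    return px.index(common) + py.index(common)

# Example 2.1
D = (1, 2, 3, 4)
D_hat = (1.5, 1.5, 3.5, 3.5)
print(info(D, abs_dist), info(D_hat, abs_dist), ild(D, D_hat, abs_dist))

# Example 2.2 (under the assumed tree distance)
S = ("a11", "a12", "a21", "a22")
S_hat = ("a1", "a1", "a2", "a2")
S_hat_hat = ("a", "a", "a", "a")
print(info(S, tree_dist), info(S_hat, tree_dist), info(S_hat_hat, tree_dist))
print(round(ild(S, S_hat, tree_dist), 3), ild(S, S_hat_hat, tree_dist))
```

Because `info` sums over ordered pairs, every unordered pair is counted twice, which matches $I(D) = 40$ for $D = (1, 2, 3, 4)$.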
3. ILD

3.1

For the 8-element data $D$ of Example 3.1 and its replacement $\hat{D}$, cases (1) to (3) give:

(1) $I_{discr}(D) = 56$, $I_{discr}(\hat{D}) = 48$, $\mathrm{ILD}_{discr} = 0.14285$
(2) $I_t(D) = 1440$, $I_t(\hat{D}) = 576$, $\mathrm{ILD}_t = 0.6$
(3) $I_g(D) = 648$, $I_g(\hat{D}) = 640$, $\mathrm{ILD}_g = 0.01234$

Case    $I(D)$    $I(\hat{D})$    ILD
(1)     56        48              0.14285
(2)     1440      576             0.60000
(3)     648       640             0.01234

3.2 ILD

For $X = \mathbb{R}^n$ and $1 \le p \le \infty$, the distance $d_p$ is

\[
d_p(x, y) =
\begin{cases}
\left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p} & (1 \le p < \infty) \\
\max_{i = 1, \ldots, n} |x_i - y_i| & (p = \infty)
\end{cases}
\]

so that $d_2(x, y)$ is the Euclidean distance of Section 2.

A second table of values for $D$ (row labels lost in extraction; rows in source order):

$I(D)$    $I(\hat{D})$    ILD
56        48              0.14285
272       160             0.41176
168       160             0.04762
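As a minimal sketch, the distance $d_p$ and the ILD arithmetic of the tables can be written as follows. The function names `p_distance` and `ild_from_totals` are hypothetical, and the sample vectors are illustrative, not from the report. The `reported` pairs are the $(I(D), I(\hat{D}))$ values that survive in the extracted text.

```python
import math

def p_distance(x, y, p):
    """d_p(x, y) for 1 <= p <= infinity; p = math.inf gives the max distance."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= infinity")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1 / p)

def ild_from_totals(i_orig, i_replaced):
    """ILD = (I(D) - I(D_hat)) / I(D), computed from the two totals."""
    return (i_orig - i_replaced) / i_orig

# Illustrative vectors (not from the report).
x, y = (0.0, 0.0), (3.0, 4.0)
print(p_distance(x, y, 1))         # 7.0
print(p_distance(x, y, 2))         # 5.0
print(p_distance(x, y, math.inf))  # 4.0

# (I(D), I(D_hat)) pairs as reported in the two tables.
reported = [(56, 48), (1440, 576), (648, 640), (272, 160), (168, 160)]
for i_d, i_d_hat in reported:
    print(i_d, i_d_hat, ild_from_totals(i_d, i_d_hat))
```

The recomputed ratios agree with the tabulated ILD values to within the last printed digit.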
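4. ILD and ILSSDM

4.1 ILSSDM

Let $X \subset \mathbb{R}^n$ consist of $N$ elements divided into groups $X_1, \ldots, X_m$, where $X_i$ has $n_i$ elements $x_{ij}$ $(1 \le j \le n_i)$. Let $\bar{x}_i$ be the mean of $X_i$ and $\bar{x}$ the mean of $X$. Define

\[
\mathrm{SSE} = \sum_{i=1}^{m} \sum_{j=1}^{n_i} |x_{ij} - \bar{x}_i|^2, \qquad
\mathrm{SSA} = \sum_{i=1}^{m} n_i |\bar{x}_i - \bar{x}|^2, \qquad
\mathrm{SST} = \sum_{i=1}^{m} \sum_{j=1}^{n_i} |x_{ij} - \bar{x}|^2 = \mathrm{SSE} + \mathrm{SSA},
\]

where SSE is the Sum of Squared Errors. ILSSDM is

\[
\mathrm{ILSSDM} = \frac{\mathrm{SSE}}{\mathrm{SST}} = \frac{\mathrm{SST} - \mathrm{SSA}}{\mathrm{SST}} .
\]

Writing $\mathrm{var}(X_i)$ and $\mathrm{var}(X)$ for the variances of $X_i$ and $X$, we have $\mathrm{SSE} = \sum_{i=1}^{m} n_i \mathrm{var}(X_i)$ and $\mathrm{SST} = N \mathrm{var}(X)$, so

\[
\mathrm{ILSSDM} = \frac{\sum_{i=1}^{m} n_i \mathrm{var}(X_i)}{N \mathrm{var}(X)} . \tag{1}
\]

4.2

With the Euclidean distance,

\[
\begin{aligned}
I(X) &= \sum_{i=1}^{m} \sum_{j=1}^{m} \sum_{x \in X_i} \sum_{y \in X_j} |x - y|^2 \\
&= \sum_{i=1}^{m} \sum_{j=1}^{m} \sum_{x \in X_i} \sum_{y \in X_j} |(x - \bar{x}_i) + (\bar{x}_i - \bar{x}_j) + (\bar{x}_j - y)|^2 \\
&= \sum_{i=1}^{m} \sum_{j=1}^{m} \sum_{x \in X_i} \sum_{y \in X_j} \left( |x - \bar{x}_i|^2 + |\bar{x}_i - \bar{x}_j|^2 + |\bar{x}_j - y|^2 \right) \\
&= \sum_{i=1}^{m} \sum_{j=1}^{m} n_i n_j \left( \mathrm{var}(X_i) + |\bar{x}_i - \bar{x}_j|^2 + \mathrm{var}(X_j) \right) .
\end{aligned} \tag{2}
\]

The cross terms vanish because $\sum_{x \in X_i} (x - \bar{x}_i) = 0$ and $\sum_{y \in X_j} (\bar{x}_j - y) = 0$. Applying the same argument to $X$ as a whole,

\[
\begin{aligned}
I(X) &= \sum_{x \in X} \sum_{y \in X} |x - y|^2 \\
&= \sum_{x \in X} \sum_{y \in X} |(x - \bar{x}) + (\bar{x} - y)|^2 \\
&= \sum_{x \in X} \sum_{y \in X} \left( |x - \bar{x}|^2 + |\bar{x} - y|^2 \right) \\
&= 2 N^2 \mathrm{var}(X) .
\end{aligned} \tag{3}
\]

4.3

Let $X'$ be the data obtained from $X$ by replacing every element of $X_i$ with $\bar{x}_i$. Then

\[
I(X') = \sum_{i=1}^{m} \sum_{j=1}^{m} n_i n_j |\bar{x}_i - \bar{x}_j|^2 . \tag{4}
\]

From (2) and (4),

\[
\begin{aligned}
I(X) - I(X') &= \sum_{i=1}^{m} \sum_{j=1}^{m} n_i n_j \left( \mathrm{var}(X_i) + \mathrm{var}(X_j) \right) \\
&= 2 \sum_{i=1}^{m} \sum_{j=1}^{m} n_i n_j \mathrm{var}(X_i) \\
&= 2 N \sum_{i=1}^{m} n_i \mathrm{var}(X_i) .
\end{aligned} \tag{5}
\]

From (3) and (5), and then (1),

\[
\mathrm{ILD} = \frac{I(X) - I(X')}{I(X)} = \frac{\sum_{i=1}^{m} n_i \mathrm{var}(X_i)}{N \mathrm{var}(X)} = \mathrm{ILSSDM} . \tag{6}
\]

4.4

For data of size $N$, computing ILD directly from its definition takes $O(N^2)$ time, whereas ILSSDM can be computed in $O(N)$ time.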
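The identity (6) can be checked numerically. The following is a sketch under stated assumptions: the function names (`mean`, `sq_dist`, `ilssdm`, `ild_pairwise`) and the sample groups are hypothetical, the distance is Euclidean, and each group is replaced by its mean. `ilssdm` makes a constant number of linear passes over the data, while `ild_pairwise` visits every ordered pair, which mirrors the $O(N)$ versus $O(N^2)$ comparison.

```python
def mean(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def sq_dist(x, y):
    """Squared Euclidean distance |x - y|^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def ilssdm(groups):
    """SSE / SST, computed with linear passes over the data."""
    all_points = [p for g in groups for p in g]
    grand_mean = mean(all_points)
    sse = 0.0
    for g in groups:
        g_mean = mean(g)
        sse += sum(sq_dist(p, g_mean) for p in g)
    sst = sum(sq_dist(p, grand_mean) for p in all_points)
    return sse / sst

def ild_pairwise(groups):
    """(I(X) - I(X')) / I(X), computed over all ordered pairs."""
    original = [p for g in groups for p in g]
    replaced = []
    for g in groups:
        replaced.extend([mean(g)] * len(g))  # every element becomes its group mean

    def info(data):
        return sum(sq_dist(x, y) for x in data for y in data)

    i_x = info(original)
    return (i_x - info(replaced)) / i_x

# Illustrative groups in R^2 (not from the report).
groups = [
    [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)],
    [(8.0, 9.0), (9.0, 7.0)],
    [(4.0, 10.0), (5.0, 12.0), (6.0, 11.0), (5.0, 9.0)],
]
print(ilssdm(groups), ild_pairwise(groups))

# Example 2.1 as one-dimensional groups: both measures give 0.2.
example_2_1 = [[(1.0,), (2.0,)], [(3.0,), (4.0,)]]
print(ilssdm(example_2_1), ild_pairwise(example_2_1))
```

The two values printed on each line should agree up to floating-point rounding.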
[1] J. Domingo-Ferrer and V. Torra. Ordinal, continuous and heterogeneous k-anonymity through microaggregation. Data Mining and Knowledge Discovery, Vol. 11, No. 2, pp. 195–212, 2005.
[2] J. Domingo-Ferrer, A. Martínez-Ballesté, J. M. Mateo-Sanz, and F. Sebé. Efficient multivariate data-oriented microaggregation. The VLDB Journal, Vol. 15, No. 4, pp. 355–369, 2006.
[3] A. W. F. Edwards and L. L. Cavalli-Sforza. A method for cluster analysis. Biometrics, pp. 362–375, 1965.
[4] A. D. Gordon and J. T. Henderson. An algorithm for Euclidean sum of squares classification. Biometrics, pp. 355–362, 1977.
[5] P. Hansen, B. Jaumard, and N. Mladenovic. Minimum sum of squares clustering in a low dimensional space. Journal of Classification, Vol. 15, No. 1, pp. 37–55, 1998.
[6] J. MacQueen, et al. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, pp. 281–297. Oakland, CA, USA, 1967.
[7] J. H. Ward Jr. Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, Vol. 58, No. 301, pp. 236–244, 1963.
[8] A. Solanas, A. Martinez-Balleste, and J. Domingo-Ferrer. V-MDAV: a multivariate microaggregation with variable group size. In 17th COMPSTAT Symposium of the IASC, Rome, pp. 917–925, 2006.
[9] A. Oganian and J. Domingo-Ferrer. On the complexity of optimal microaggregation for statistical disclosure control. Statistical Journal of the United Nations Economic Commission for Europe, Vol. 18, No. 4, pp. 345–353, 2001.
[10] J. Domingo-Ferrer and J. M. Mateo-Sanz. Practical data-oriented microaggregation for statistical disclosure control. IEEE Transactions on Knowledge and Data Engineering, Vol. 14, No. 1, pp. 189–201, 2002.