060310391
0560565
14
2017/1/16 13:00-14:45
@1 - 4
…
DNA
•
•
•
•
– k-means
SOM
•
– SVM
•
mRNA
http://www.scq.ubc.ca/spot-your-genes-an-overview-of-the-microarray/ Art by Jiang Long
cDNA
2
2
cDNA
= /
http://www.promega.com/enotes/applications/ap0066.htm
GeneChip
1
mRNA
→ cDNA
→ cRNA
→ cRNA
GeneChip
20 25
10 20
Perfect Match (PM), Mismatch (MM)
MM PM 1
http://www.scq.ubc.ca/spot-your-genes-an-overview-of-the-microarray/ Art by Jiang Long
GeneChip PM MM
• 1
• PM MM
Schadt et al. (2000) J Cell Biochem 80: 192
Perfect Match PM Mismatch MM
RNA-seq
Wang et al. (2009) Nat Rev Genet. 10: 57–63
RNA-seq
Conesa et al. (2016) Genome Biology 17:13
‘align-then-assemble’ and ‘assemble-then-align’
Haas and Zody (2010) Nat. Biotech. 28: 421-423
•
–
–
–
•
–
•
–
-
•
– hierarchical clustering
– non-hierarchical clustering
–
→
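Expression data (rows: genes 1-4, columns: samples 1-4):

          1      2      3      4
   1   1.53   2.38   2.80   0.60
   2   1.03   2.54   3.29   0.80
   3   0.85   0.21   0.34   3.02
   4   1.03   0.82   0.94   1.20

Correlation coefficient r between rows:

          1      2      3      4
   1   1.00   0.95  -0.92  -0.88
   2   0.95   1.00  -0.74  -0.77
   3  -0.92  -0.74   1.00   0.93
   4  -0.88  -0.77   0.93   1.00

Absolute correlation |r| between rows:

          1      2      3      4
   1   1.00   0.95   0.92   0.88
   2   0.95   1.00   0.74   0.77
   3   0.92   0.74   1.00   0.93
   4   0.88   0.77   0.93   1.00

Euclidean distance between rows:

          1      2      3      4
   1   0.00   0.74   4.13   2.55
   2   0.74   0.00   4.36   2.94
   3   4.13   4.36   0.00   2.02
   4   2.55   2.94   2.02   0.00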
Correlation coefficient:

$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} $$
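As a quick numerical check of the correlation formula, the following minimal sketch computes r for rows 1 and 2 of the small expression table. The function name `pearson` and the variable names are illustrative choices, not part of the original slides.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: square root of the product of the sums of squared deviations.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Rows 1, 2, and 3 of the expression table.
row1 = [1.53, 2.38, 2.80, 0.60]
row2 = [1.03, 2.54, 3.29, 0.80]
row3 = [0.85, 0.21, 0.34, 3.02]

r12 = pearson(row1, row2)
r13 = pearson(row1, row3)
print(round(r12, 2))  # 0.95
print(round(r13, 2))  # -0.92
```

Rows with similar profiles give r close to 1, and rows with opposite profiles give r close to -1.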
[Figure: expression level (y-axis) plotted against sample ID (x-axis) for genes 1-4.]
• 1
• dendrogram
•
•
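Distance between $x_i = (x_{i1}, \dots, x_{ip})$ and $x_j = (x_{j1}, \dots, x_{jp})$:

Euclidean distance:
$$ d(x_i, x_j) = \sqrt{(x_{i1} - x_{j1})^2 + \cdots + (x_{ip} - x_{jp})^2} $$

Manhattan distance:
$$ d(x_i, x_j) = |x_{i1} - x_{j1}| + \cdots + |x_{ip} - x_{jp}| $$

Maximum distance:
$$ d(x_i, x_j) = \max\{ |x_{i1} - x_{j1}|, \dots, |x_{ip} - x_{jp}| \} $$

Minkowski distance with exponent $q$:
$$ d(x_i, x_j) = \left( |x_{i1} - x_{j1}|^q + \cdots + |x_{ip} - x_{jp}|^q \right)^{1/q} $$
$q = 1$: Manhattan distance; $q = 2$: Euclidean distance; $q \to \infty$: maximum distance.

Canberra distance:
$$ d(x_i, x_j) = \frac{|x_{i1} - x_{j1}|}{|x_{i1} + x_{j1}|} + \cdots + \frac{|x_{ip} - x_{jp}|}{|x_{ip} + x_{jp}|} $$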
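The distance measures listed here can be sketched in a few lines. This is an illustrative implementation only: the function names are assumptions, the Minkowski exponent is passed as `q`, and skipping zero-denominator terms in the Canberra distance is a choice made for this sketch.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def maximum(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def minkowski(x, y, q):
    # q = 1 gives the Manhattan distance, q = 2 the Euclidean distance.
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)

def canberra(x, y):
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a + b)
        if denom > 0:  # assumption: terms with a zero denominator are skipped
            total += abs(a - b) / denom
    return total

# Rows 1 and 2 of the small expression table.
row1 = [1.53, 2.38, 2.80, 0.60]
row2 = [1.03, 2.54, 3.29, 0.80]

for name, value in [
    ("euclidean", euclidean(row1, row2)),
    ("manhattan", manhattan(row1, row2)),
    ("maximum", maximum(row1, row2)),
    ("canberra", canberra(row1, row2)),
]:
    print(name, round(value, 3))
```

Because the Canberra distance divides each difference by the size of the values, it weights small expression levels more heavily than the other measures, which is one reason it can give a different dendrogram.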
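1. Compute the distances between all pairs of objects.
2. Merge the two closest objects (or clusters) into 1 cluster.
3. Recompute the distances between the new cluster and the remaining ones.
4. Repeat steps 2-3 until everything has been merged into 1 cluster.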
[Figure: scatter plot of genes 1-5 (x-axis: expression level in Exp. 1, y-axis: expression level in Exp. 2) and the resulting dendrogram from hclust with "average" linkage (leaf order: gene1, gene2, gene5, gene3, gene4; y-axis: distance).]
Distance between two clusters A and B:
1. nearest neighbor method (a.k.a. single linkage)
2. furthest neighbor method (a.k.a. complete linkage)
3. group average method
4. centroid method
5. median method
6. Ward's method: d(A, B) = E(A ∪ B) − E(A) − E(B), where E(X) is the sum of squared deviations of the members of X from the centroid of X
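To make the difference between linkage rules concrete, here is a minimal sketch of agglomerative clustering with single, complete, and average linkage. It assumes Euclidean distances and uses the four rows of the small expression table as input. The function names and the 0-based indices are illustrative, and this naive version is not how R's `hclust` is implemented.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def agglomerate(points, linkage="average"):
    """Return the merge history as a list of (members_a, members_b, distance)."""
    clusters = [[i] for i in range(len(points))]
    merges = []

    def cluster_distance(a, b):
        pair = [euclidean(points[i], points[j]) for i in a for j in b]
        if linkage == "single":      # nearest neighbor
            return min(pair)
        if linkage == "complete":    # furthest neighbor
            return max(pair)
        if linkage == "average":     # group average
            return sum(pair) / len(pair)
        raise ValueError("unknown linkage: " + linkage)

    while len(clusters) > 1:
        # Find the closest pair of clusters under the chosen linkage.
        best = None
        for s in range(len(clusters)):
            for t in range(s + 1, len(clusters)):
                d = cluster_distance(clusters[s], clusters[t])
                if best is None or d < best[0]:
                    best = (d, s, t)
        d, s, t = best
        merges.append((sorted(clusters[s]), sorted(clusters[t]), d))
        merged = clusters[s] + clusters[t]
        clusters = [c for k, c in enumerate(clusters) if k not in (s, t)]
        clusters.append(merged)
    return merges

rows = [
    [1.53, 2.38, 2.80, 0.60],
    [1.03, 2.54, 3.29, 0.80],
    [0.85, 0.21, 0.34, 3.02],
    [1.03, 0.82, 0.94, 1.20],
]
for name in ("single", "complete", "average"):
    history = agglomerate(rows, name)
    print(name, [(a, b, round(d, 2)) for a, b, d in history])
```

For this small table all three rules merge the same pairs in the same order, but the height of the final merge differs, which is what changes the shape of the dendrogram.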
[Figure: scatter plot of gene1-gene5 (x-axis: expression level in Exp. 1, y-axis: expression level in Exp. 2) with two dendrograms of the same data: complete linkage (hclust, "complete"; leaf order gene3, gene4, gene5, gene1, gene2) and single linkage (hclust, "single"; leaf order gene1, gene2, gene5, gene3, gene4); y-axis: distance.]
dendrogram
→
•
[Figure: single-linkage dendrograms (hclust, "single") of gene1-gene5 computed from four distance measures: Euclidean, maximum, Manhattan, and Canberra (dist with the corresponding method argument); y-axis: height. The Canberra dendrogram has a different leaf order (gene1, gene2, gene5, gene3, gene4) from the other three (gene3, gene4, gene5, gene1, gene2).]
•
→
• $1 - r_{ij}$, $1 - |r_{ij}|$
(1)
•
• 0, 15, 30 min; 1, 2, 3, 4, 8, 12, 16, 20, 24 h
• 0
•
Eisen et al. (1998) PNAS 95: 14863
Cholesterol biosynthesis
Cell cycle
Immediate-early response
Signaling & angiogenesis
Wound healing & tissue remodeling
→
…
…
Transposed expression data (rows: samples 1-4, columns: genes 1-4):

          1      2      3      4
   1   1.53   1.03   0.85   1.03
   2   2.38   2.54   0.21   0.82
   3   2.80   3.29   0.34   0.94
   4   0.60   0.80   3.02   1.20
→
(2)
• 60
cell lines
•
Ross et al. (2000) Nat Genet 24: 227
•
• k-means
•
Self-organizing maps (SOM)
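k-means
1. Choose the number of clusters k and set k initial cluster centers.
2. Assign each data point to the nearest of the k centers.
3. Recompute each center as the mean of the points assigned to it.
4. Repeat steps 2-3 until the assignments no longer change.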
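The k-means iteration can be sketched as follows. This is a minimal illustration under stated assumptions: initial centers are drawn at random from the data, squared Euclidean distance is used, and the toy points, function name, and parameters (`n_iter`, `seed`) are invented for the example.

```python
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Plain k-means; returns (labels, centers)."""
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:
            break  # assignments unchanged, so the algorithm has converged
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Two well-separated toy groups (hypothetical data).
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

The result depends on the initial centers, so in practice the algorithm is usually run from several random starts and the solution with the smallest within-cluster sum of squares is kept.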
k-means –
Bishop CM (2006) Pattern recognition and machine learning. Springer.
• 386 1311SNPs
• k− 5
•
[Figure: PCA score plots (PC1 vs. PC2 and PC3 vs. PC4) showing the k-means clustering at Rep 0, Rep 1, Rep 2, and Rep 10.]
•
• L
Soukas et al. (2000) Genes & Development 14: 963
k-medoids
• : 229
22
•
48
• k-medoids in R: function pam in package cluster
48 medoids
• 48
● 48
• 229
48
[Figure: PCA score plots (PC1 vs. PC2 and PC3 vs. PC4) and histograms of the PC1-PC4 scores for all samples (pca.tr$x[, i]) and for the samples selected by k-medoids (pca.tr$x[kmed$id.med, i]); y-axis: frequency.]
• medoid
• medoid
•
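1. Prepare a grid of nodes (for example, 5x5).
2. Initialize the weight vector w(0) of each node.
3. Select 1 input vector g_i.
4. Find the node whose weight vector is closest to g_i: the BMU (best matching unit).
5. Move the weight vectors of the BMU and of the nodes near the BMU toward g_i:
   w(t+1) = w(t) + h(d, t)(g_i − w(t))
   h(d, t) = θ(d, t)α(t)
   θ(d, t): neighborhood function (d: distance from the BMU on the grid); α(t): learning rate
6. Repeat steps 3-5 while decreasing θ(d, t) and α(t).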
Example: $g_i = (0.70, 0.23, 0.31)$, $w_{\mathrm{BMU}} = (0.60, 0.26, 0.30)$

Update rule: $w(t+1) = w(t) + h(d,t)\,(g_i - w(t))$

BMU, with $h(0,t) = 0.8$:

$$ w(t+1) = \begin{pmatrix} 0.60 \\ 0.26 \\ 0.30 \end{pmatrix} + 0.8 \times \left( \begin{pmatrix} 0.70 \\ 0.23 \\ 0.31 \end{pmatrix} - \begin{pmatrix} 0.60 \\ 0.26 \\ 0.30 \end{pmatrix} \right) = \begin{pmatrix} 0.68 \\ 0.24 \\ 0.31 \end{pmatrix} $$

Node next to the BMU, with $h(1,t) = 0.4$ (and $h(d>1,t) = 0$ for all nodes further away):

$$ w(t+1) = \begin{pmatrix} 0.66 \\ 0.72 \\ 0.72 \end{pmatrix} + 0.4 \times \left( \begin{pmatrix} 0.70 \\ 0.23 \\ 0.31 \end{pmatrix} - \begin{pmatrix} 0.66 \\ 0.72 \\ 0.72 \end{pmatrix} \right) = \begin{pmatrix} 0.68 \\ 0.52 \\ 0.56 \end{pmatrix} $$
(2,3)
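The update rule w(t+1) = w(t) + h(d, t)(g_i − w(t)) is simple enough to evaluate directly. The sketch below applies it to the BMU (h = 0.8) and to a neighboring node (h = 0.4) using the vectors from this example. The function and variable names are illustrative.

```python
def som_update(w, g, h):
    """One SOM weight update: w(t+1) = w(t) + h * (g - w(t))."""
    return [wi + h * (gi - wi) for wi, gi in zip(w, g)]

g_i = [0.70, 0.23, 0.31]         # input vector
w_bmu = [0.60, 0.26, 0.30]       # weight vector of the BMU
w_neighbor = [0.66, 0.72, 0.72]  # weight vector of a node next to the BMU

new_bmu = som_update(w_bmu, g_i, 0.8)            # h(0, t) = 0.8
new_neighbor = som_update(w_neighbor, g_i, 0.4)  # h(1, t) = 0.4

print([round(v, 2) for v in new_bmu])       # [0.68, 0.24, 0.31]
print([round(v, 2) for v in new_neighbor])  # [0.68, 0.52, 0.56]
```

With h = 1 a node would jump exactly onto g_i, and with h = 0 it would not move, so h controls how strongly each node is pulled toward the input.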
• http://genomics.stanford.edu
• 6 × 5 828
•
•
•
Tamayo et al. (1999) PNAS 96: 2907
SOM
•
•
• supervised learning
–
1 → (classification)
→ regression
• SVM
Random Forest
• unsupervised learning
–
• k-means
support vector machine (SVM)
•
•
•
•
SVM
basis function
kernel function
feature space
input space
Linear kernel: $k(x, z) = x^{T} z$
Polynomial kernel: $k(x, z) = (x^{T} z + c)^{M}$
Gaussian kernel: $k(x, z) = \exp\left( -\frac{\lVert x - z \rVert^{2}}{2\sigma^{2}} \right)$
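The three kernel functions can be written out directly. This is a minimal sketch for vectors given as Python lists; the function names and the default values of `c`, `M`, and `sigma` are illustrative assumptions.

```python
import math

def linear_kernel(x, z):
    """k(x, z) = x^T z"""
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, c=1.0, M=2):
    """k(x, z) = (x^T z + c)^M"""
    return (linear_kernel(x, z) + c) ** M

def gaussian_kernel(x, z, sigma=1.0):
    """k(x, z) = exp(-||x - z||^2 / (2 sigma^2))"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

x = [1.0, 2.0]
z = [0.5, -1.0]
print(linear_kernel(x, z))      # -1.5
print(polynomial_kernel(x, z))  # 0.25
print(gaussian_kernel(x, x))    # 1.0
print(gaussian_kernel(x, z))
```

The Gaussian kernel equals 1 when the two inputs coincide and decays toward 0 as they move apart, with `sigma` setting how quickly.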
•
y = 5sin(x) + e
e ~ N (0, 1)
[Figure: linear regression fitted to the simulated data (x-axis: data$x, y-axis: data$y).]
•
•
[Figure: shape of the kernel exp(-beta * x^2) for beta = 1, plotted for x from -3 to 3.]
• 2
• 2 x i , x j
• x, y
$$ k(x_j, x_i) = \exp\left( -\beta \lVert x_j - x_i \rVert^{2} \right) $$

$$ y = f(x) = \sum_{j=1}^{n} \alpha_j \, k(x_j, x) $$
x x j k(x j , x)
α j
y x
x
…
φ
feature space
input space
$x \to \phi(x)$

$$ y = \sum_{m=1}^{M} w_m x_m + e = w^{T} x + e $$

$$ y = \sum_{k=1}^{K} w_k \phi_k(x) + e = w^{T} \phi(x) + e $$

$$ w = \sum_{j=1}^{n} \alpha_j \phi(x_j) $$

$$ y = \sum_{j=1}^{n} \alpha_j \phi(x_j)^{T} \phi(x) + e = \sum_{j=1}^{n} \alpha_j \, k(x_j, x) + e $$
$$ K = \begin{pmatrix} k(x_1, x_1) & \cdots & k(x_n, x_1) \\ \vdots & \ddots & \vdots \\ k(x_1, x_n) & \cdots & k(x_n, x_n) \end{pmatrix} $$
•
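$$ R(\alpha) = \sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{n} \alpha_j \, k(x_j, x_i) \right)^{2} = (y - K\alpha)^{T} (y - K\alpha) $$
•
• Minimizing $R(\alpha)$ with respect to $\alpha$ gives
$$ \alpha = (K^{T} K)^{-1} K^{T} y = K^{-1} y $$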
[Figure: kernel regression without regularization (x-axis: data$x, y-axis: data$y).]
…
•
overfitting
•
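$$ R(\alpha) = (y - K\alpha)^{T} (y - K\alpha) + \lambda \alpha^{T} K \alpha $$
• $\lambda$: regularization parameter
• Minimizing $R(\alpha)$ with respect to $\alpha$ gives
$$ \alpha = (K + \lambda I)^{-1} y $$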
[Figure: kernel regression with regularization for lmbd = 0.4, lmbd = 0.04, and lmbd = 4 (x-axis: data$x, y-axis: data$y).]
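The regularized solution α = (K + λI)^{-1} y can be tried on simulated data of the form y = 5 sin(x) + e. The following is a minimal sketch using only the standard library. The sample size, the grid of x values, the random seed, and all function names are assumptions made for illustration, and the linear system is solved with a small hand-written Gaussian elimination rather than a numerical library.

```python
import math
import random

def gaussian_kernel(a, b, beta=1.0):
    """k(a, b) = exp(-beta * (a - b)^2) for scalar inputs."""
    return math.exp(-beta * (a - b) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit(xs, ys, lam, beta=1.0):
    """Return alpha = (K + lam * I)^{-1} y."""
    n = len(xs)
    K = [[gaussian_kernel(xs[i], xs[j], beta) for j in range(n)]
         for i in range(n)]
    for i in range(n):
        K[i][i] += lam
    return solve(K, ys)

def predict(x, xs, alpha, beta=1.0):
    """f(x) = sum_j alpha_j k(x_j, x)"""
    return sum(a * gaussian_kernel(xj, x, beta) for a, xj in zip(alpha, xs))

# Simulated data: y = 5 sin(x) + e, e ~ N(0, 1) (hypothetical sample of 30 points).
rng = random.Random(1)
xs = [10.0 * i / 29 for i in range(30)]
ys = [5.0 * math.sin(x) + rng.gauss(0.0, 1.0) for x in xs]

alpha = fit(xs, ys, lam=0.4)
print("fitted:", round(predict(5.0, xs, alpha), 2),
      "true:", round(5.0 * math.sin(5.0), 2))
```

A larger `lam` shrinks the coefficients and gives a smoother curve, while a very small `lam` lets the fit follow the noise, which is the overfitting behavior the regularization term is meant to control.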