Nagoya University intensive lecture iwatawiki lec06 s

(1)

6 2016/11/30 13:00-14:30

•  2050

1.7

• 2050 90 …

Tester M, Langridge P (2010) Science 327: 818

SSA

• 1975

/ 292

1982

1984

(2)

Meuwissen et al. (2001) Genetics 157:1819

DNA 5

DNA

•  DNA

http://www.illuminakk.co.jp/product/genoTYPING/bovinesnp50_beadchip.shtml

Illumina BovineSNP50 BeadChip

24 54,609 DNA

…

Garcia-Ruiz et al. Proc Natl Acad Sci 113(28): E3995-4004

“The most dramatic response to genomic selection was observed for the lowly heritable traits DPR, PL, and SCS. Genetic trends changed from close to zero to large and favorable, resulting in rapid genetic improvement in fertility, lifespan, and health in a breed where these traits eroded over time.”

(3)

(

, ,

)

Watanabe et al. (2005) Ann Bot. 95:1131

…

DNA

_DNA

DNA

LD

(4)

GS

DNA

$y_i$: individual $i$

$x_{ij}$: individual $i$, marker $j$

$$y = f(x_1, x_2, \ldots, x_K)$$

(y) DNA ($x_i$)

$f(x)$

$f(\cdot)$

y DNA $x_1, \ldots, x_K$

vs.

Watanabe et al. (2005) Ann Bot. 95:1131

DNA

$$y_i = \mu + \sum_{j=0}^{N} \beta_j x_{ij} + e_i$$

...

DNA

Watanabe et al. (2005) Ann Bot. 95:1131

DNA

1 http://www.soran.net/index_html/A0084070.htm

1 3

F1

6 GS GS

GS GS 3

(5)

[Figure: a model $y = f(x)$ linking $x$ to $y$; numbers shown in the figure: 273, 142, 304, 423, 173, 373, 234, 138, 203, 133, 223]

“large p small n” problem

p >> n

x, y 1

↓

60K …

…

$$y_i = \mu + \sum_{j=0}^{J} \beta_j x_{ij} + e_i$$

PRESS

S _e

GS

[Figure: three panels of y against x (x axis 0 to 10, y axis 0 to 12), with the fitted models $y = b_0 + b_1 x$ and $y = b_0 + \sum_{k=1}^{7} b_k x^k$]

(6)

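Cross-validation (n-fold cross-validation)

1. Divide the data into n groups.

2. Set group i aside and fit the model to the remaining groups.

3. Predict group i with the model from step 2.

4. Repeat steps 2 and 3 n times.

5. Compare the predictions $\hat{y}_i$ with the observations $y_i$.

The sum of squared prediction errors is PRESS.

With n equal to the number of samples, this is leave-one-out.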
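The cross-validation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model is a one-predictor least-squares line, the helper names `fit_line` and `cross_validate` are hypothetical, the groups are formed by taking every n-th sample, and the toy data are made up.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def cross_validate(xs, ys, n_folds):
    """Return out-of-fold predictions and PRESS for n-fold cross-validation."""
    n = len(xs)
    preds = [None] * n
    for i in range(n_folds):
        # Group i: every n_folds-th sample, starting at index i (an assumed split).
        test = set(range(i, n, n_folds))
        train = [k for k in range(n) if k not in test]
        # Fit on everything except group i, then predict group i.
        b0, b1 = fit_line([xs[k] for k in train], [ys[k] for k in train])
        for k in test:
            preds[k] = b0 + b1 * xs[k]
    # PRESS: sum of squared differences between observed and predicted values.
    press = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return preds, press


# Toy data (hypothetical values): exactly y = 1 + 2x, so PRESS is essentially 0.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
preds, press = cross_validate(xs, ys, n_folds=3)
loo_preds, loo_press = cross_validate(xs, ys, n_folds=len(xs))  # leave-one-out
```

Setting `n_folds` to the number of samples gives the leave-one-out case.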

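Ridge, LASSO

$$y_i = u + \sum_{j}^{J} \beta_j x_{ij} + e_i$$

$$E_{(k)} = \sum_i \Big( y_i - u - \sum_{j}^{J} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j}^{J} |\beta_j|^k$$

k = 1: LASSO

k = 2: ridge

[Figure: curves for k = 1 and k = 2 over $\beta_j$; horizontal axis -2 to 2, vertical axis 0.0 to 2.0]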
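To see how the two penalties behave, it may help to look at the simplest possible case: a single coefficient β and a single observation y, so that $E_{(k)} = (y - \beta)^2 + \lambda |\beta|^k$. This reduction is an assumption made here for illustration and is not taken from the slides. In that case the minimizer has a closed form: $\beta = y / (1 + \lambda)$ for k = 2, and the soft-threshold $\beta = \mathrm{sign}(y) \max(|y| - \lambda/2, 0)$ for k = 1. The function names below are hypothetical.

```python
def penalized_error(beta, y, lam, k):
    """E_(k) for one observation and one coefficient: (y - beta)^2 + lam * |beta|^k."""
    return (y - beta) ** 2 + lam * abs(beta) ** k


def ridge_shrink(y, lam):
    """Minimizer for k = 2: proportional shrinkage, never exactly zero unless y is zero."""
    return y / (1.0 + lam)


def lasso_shrink(y, lam):
    """Minimizer for k = 1: soft-thresholding, exactly zero when |y| <= lam / 2."""
    magnitude = max(abs(y) - lam / 2.0, 0.0)
    return magnitude if y >= 0 else -magnitude


lam = 1.0
for y in (0.2, 1.0, 3.0):
    print(y, ridge_shrink(y, lam), lasso_shrink(y, lam))
```

In this toy setting the k = 2 penalty shrinks every value by the same factor, while the k = 1 penalty sets small values to exactly zero and shifts large ones by a constant.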

•  Zhao 2011; Nature Communications 2:467

•  Rice Diversity http://www.ricediversity.org/data/

•  395 × 1311 SNPs

•  (brown.rice.seed.length)

ridge regression, lasso

[Figure: predicted seed length against observed seed length (axes 4 to 8), two panels. Ridge: r = 0.72, mse = 0.28. LASSO: r = 0.77, mse = 0.24.]

(7)

Ridge, LASSO

Ridge LASSO

0 shrink

GS3

↓

[Figure: coefficients against position (bp, 0e+00 to 3e+08), two panels. Ridge: vertical axis -0.01 to 0.02. LASSO: vertical axis -0.10 to 0.20.]

Elastic net

$$\operatorname*{argmin}_{b} \left[ \sum_i (y_i - x_i^T b)^2 + \lambda \left( \frac{1-\alpha}{2} \|b\|_2^2 + \alpha \|b\|_1 \right) \right]$$

$$y_i = \sum_{j}^{J} x_{ij} b_j + e_i = x_i^T b + e_i$$

$$\|b\|_1 = \sum_{j}^{J} |b_j|, \qquad \|b\|_2^2 = \sum_{j}^{J} b_j^2$$

α = 1: lasso

α = 0: ridge
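As a small numerical check of the penalty term, the sketch below evaluates $\lambda \left( \frac{1-\alpha}{2} \|b\|_2^2 + \alpha \|b\|_1 \right)$ for a made-up coefficient vector. The function name `elastic_net_penalty` and the example values are hypothetical.

```python
def elastic_net_penalty(b, lam, alpha):
    """lam * ((1 - alpha) / 2 * ||b||_2^2 + alpha * ||b||_1)."""
    l1 = sum(abs(bj) for bj in b)        # ||b||_1
    l2_sq = sum(bj * bj for bj in b)     # ||b||_2^2
    return lam * ((1.0 - alpha) / 2.0 * l2_sq + alpha * l1)


b = [0.5, -1.0, 0.0]  # hypothetical coefficients
lasso_only = elastic_net_penalty(b, lam=2.0, alpha=1.0)   # only the L1 term remains
ridge_only = elastic_net_penalty(b, lam=2.0, alpha=0.0)   # only the squared L2 term remains
mixed = elastic_net_penalty(b, lam=2.0, alpha=0.5)
```

With α between 0 and 1 the penalty is a weighted mix of the two terms.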

Elastic Net

[Figure: predicted seed length against observed seed length (axes 4 to 8), three panels. Ridge: r = 0.72, mse = 0.28. LASSO: r = 0.77, mse = 0.24. Elastic Net: r = 0.77, mse = 0.24.]

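$$\operatorname*{argmin}_{b} \left[ \sum_i (y_i - x_i^T b)^2 + \lambda \|b\|^2 \right] \qquad (\lambda > 0)$$

Setting the derivative with respect to $b$ to zero:

$$X'Xb + \lambda b = (X'X + \lambda I) b = X'y \iff b = (X'X + \lambda I_p)^{-1} X'y \qquad (p \times p)$$

$$b = \lambda^{-1} X'(y - Xb) = X'a \iff a = (XX' + \lambda I_n)^{-1} y \qquad (n \times n)$$

$$\hat{y}_i = x_i^T b = x_i^T X'a = x_i^T \sum_{j}^{n} a_j x_j = \sum_{j}^{n} a_j x_i^T x_j$$

The prediction is the sum over $j$ of the inner products $x_i^T x_j$, weighted by $a_j$.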

(8)

$$b = \lambda^{-1} X'(y - Xb) = X'a$$

$$a = \lambda^{-1} (y - Xb) = \lambda^{-1} (y - XX'a)$$

$$\iff \lambda a = y - XX'a$$

$$\iff (XX' + \lambda I_n) a = y$$

$$\iff a = (XX' + \lambda I_n)^{-1} y$$
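The equivalence of the two forms, $b = (X'X + \lambda I_p)^{-1} X'y$ and $b = X'a$ with $a = (XX' + \lambda I_n)^{-1} y$, can be checked numerically. The sketch below uses only the Python standard library, with a small hand-written Gaussian elimination. All helper names and the toy data (n = 3 individuals, p = 5 markers) are hypothetical, and it is meant as a check of the algebra rather than a practical implementation.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def add_ridge(m, lam):
    """Return m + lam * I for a square matrix m."""
    return [[v + (lam if i == j else 0.0) for j, v in enumerate(row)]
            for i, row in enumerate(m)]


def solve(a, rhs):
    """Solve a z = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z


def ridge_primal(X, y, lam):
    """b = (X'X + lam I_p)^{-1} X'y, a p x p system."""
    Xt = transpose(X)
    Xty = [sum(x * yi for x, yi in zip(col, y)) for col in Xt]
    return solve(add_ridge(matmul(Xt, X), lam), Xty)


def ridge_dual(X, y, lam):
    """a = (XX' + lam I_n)^{-1} y, an n x n system; then b = X'a."""
    Xt = transpose(X)
    a = solve(add_ridge(matmul(X, Xt), lam), y)
    b = [sum(x * ai for x, ai in zip(col, a)) for col in Xt]
    return b, a


# Toy data (hypothetical): 3 individuals, 5 markers, so p > n.
X = [[0, 1, 2, 1, 0],
     [2, 1, 0, 0, 1],
     [1, 1, 1, 2, 2]]
y = [5.1, 6.3, 4.8]
lam = 1.0
b_primal = ridge_primal(X, y, lam)
b_dual, a = ridge_dual(X, y, lam)
```

When p is much larger than n, the n × n system is the smaller of the two to solve.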

RR-BLUP

[Figure: predicted seed length against observed seed length (axes 4 to 8), two panels. Ridge: r = 0.72, mse = 0.28. Kernel Ridge: r = 0.72, mse = 0.28.]

x y

•  $x$ → $\phi(x)$: basis function.

•  $\kappa(x, z) = \phi(x)^T \phi(z)$.

•  $x$ → $\phi(x)$

x y

y ^y

$$y_i = x_i^T w + e_i$$

$$a = (XX' + \lambda I)^{-1} y$$

$$XX' = \begin{pmatrix}
x_1^T x_1 & x_1^T x_2 & \cdots & x_1^T x_N \\
x_2^T x_1 & x_2^T x_2 & \cdots & x_2^T x_N \\
\vdots & \vdots & \ddots & \vdots \\
x_N^T x_1 & x_N^T x_2 & \cdots & x_N^T x_N
\end{pmatrix}$$

(9)

$$y_i = \phi(x_i)^T w + e_i = \sum_{j=1}^{n} a_j \kappa(x_i, x_j) + e_i$$

$$a = (G + \lambda I)^{-1} y$$

$$G_{ij} = \kappa(x_i, x_j) = \phi(x_i)^T \phi(x_j)$$

$$G = \begin{pmatrix}
\kappa(x_1, x_1) & \kappa(x_1, x_2) & \cdots & \kappa(x_1, x_N) \\
\kappa(x_2, x_1) & \kappa(x_2, x_2) & \cdots & \kappa(x_2, x_N) \\
\vdots & \vdots & \ddots & \vdots \\
\kappa(x_N, x_1) & \kappa(x_N, x_2) & \cdots & \kappa(x_N, x_N)
\end{pmatrix}$$

$$\kappa(x_i, x_j) = x_i^T x_j$$

$$\kappa(x, x^n) = \phi(x)' \phi(x^n) = \exp\left( -h \|x - x^n\|^2 \right), \qquad h = 2 / d_{\mathrm{median}}^2$$
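A minimal sketch of how the matrix G could be built with the Gaussian kernel $\exp(-h \|x - x^n\|^2)$. The slide only gives $h = 2 / d_{\mathrm{median}}^2$; reading $d_{\mathrm{median}}$ as the median pairwise Euclidean distance between individuals is an assumption made here, and the function names and toy genotypes are hypothetical.

```python
import math
from statistics import median


def squared_distance(u, v):
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))


def median_bandwidth(X):
    """h = 2 / d_median^2, with d_median taken as the median pairwise distance (assumed)."""
    dists = [math.sqrt(squared_distance(X[i], X[j]))
             for i in range(len(X)) for j in range(i + 1, len(X))]
    return 2.0 / median(dists) ** 2


def gaussian_gram(X, h):
    """G[i][j] = exp(-h * ||x_i - x_j||^2)."""
    return [[math.exp(-h * squared_distance(xi, xj)) for xj in X] for xi in X]


# Toy marker genotypes (hypothetical): 4 individuals, 3 markers.
X = [[0, 1, 2],
     [2, 1, 0],
     [1, 1, 1],
     [0, 0, 2]]
h = median_bandwidth(X)
G = gaussian_gram(X, h)
```

The resulting G takes the place of XX' in $a = (G + \lambda I)^{-1} y$.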

(10)

[Figure: predicted seed length against observed seed length (axes 4 to 8), two panels. Kernel ridge: r = 0.72, mse = 0.28. Gaussian kernel: r = 0.73, mse = 0.27.]

GS R

•  ridge regression, LASSO, elastic net

–  glmnet etc.

•  BLUP

–  rrBLUP

•  SVM, RVM

–  kernlab etc.

•  random forest

–  randomForest etc

•  Bayesian linear regression (Bayesian ridge, Bayesian LASSO)

–  BLR etc

•  RKHS regression

–  RKHSw

Crossa et al. (2010) Genetics 186: 713 R

GS

Zhong et al. (2009) Genetics 182: 355

Crossa et al. (2010) Genetics 186: 713

Iwata and Jannink (2011) Crop Sci 51: 1915

Heffner et al. (2011) Crop Sci 51: 2597

GS

¹ GS1 GS2 ² GS3 GS4 ³ GS5 GS6

GS1

1 GS1

(

)

GS1

n = 48

GS2

1 GS2

×

2 n = 192

× ×

×

n = 12

4 +

y=f(x)

14,598 DNA

Hiseq

2000

Illumina HP


(11)

Value      Weight      Product
150        × 0.728     = 109.2
100 cm     × 0.550     = 55.0
500 g/L    × 0.306     = 153.0
12         × 0.053     = 0.636
4          × 0.015     = 0.060
25         × 0.011     = 0.275
30 g       × -0.001    = -0.030
Sum                      318.1
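The arithmetic in the columns above can be reproduced directly: each value is multiplied by its coefficient and the products are summed. The numbers are the ones listed on the slide; the variable names are hypothetical, and reading the columns as "value × weight = product" follows the "×" and "=" signs in the original.

```python
# Values and coefficients as listed above (units omitted).
values = [150, 100, 500, 12, 4, 25, 30]
weights = [0.728, 0.550, 0.306, 0.053, 0.015, 0.011, -0.001]

# Each product is value * weight; the prediction is their sum.
products = [v * w for v, w in zip(values, weights)]
total = sum(products)

for v, w, p in zip(values, weights, products):
    print(f"{v} x {w} = {p:.3f}")
print(f"sum = {total:.1f}")
```

The first three terms account for almost all of the total, because their weights are an order of magnitude larger than the others.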

11 kg/10a

12 y=f(x)

14,598 DNA

Hiseq

2000

Illumina HP

[Figure: bar chart with values 1.00, 1.11, 1.11, 1.38, 1.32, 1.36, 1.44; vertical axis 1.0 to 1.5; horizontal axis labels GS1, GS2, GS3, GS4, GS5, GS6]

1 2 3

1.44

3

1 2.0 1.5

1

1.37

1

1.27

[Figure: histograms by trait, with panels repeated across generations up to GS6. Horizontal-axis ranges that survive: 0 to 220, 60 to 160, 350 to 700, 0 to 900, 8 to 17, 2 to 8, 17 to 28, 16 to 38. Values shown beside the panels, in order of appearance: 0.728, 63.7 %; 44.3 %; 0.550, 28.6 %; 0.306, 8.0 %; 0.053, 23.4 %; 0.015, 0.011; 8.3 %; -0.001.]

(12)
