GWAS and GS (iwatawiki lec022 s)

Academic year: 2018


2017/11/24

2

1

(association analysis)

(association study)

……T……C……A……G……T………A…………A………

• 

DNA

……T……C……A……A……T………A…………A………

……T……A……G……G……T………A…………A………

……G……C……A……A……T………C…………T………

……T……A……A……G……T………A…………A………

……T……C……A……A……T………A…………A………

……T……C……A……G……T………A…………T………

……T……A……G……G……T………C…………A………

……G……C……A……G……C………A…………T………

……T……A……A……G……C………A…………A………

……T……C……A……G……C………C…………T………

……T……A……A……A……C………C…………A………

……T……C……G……G……C………A…………T………

……G……A……A……G……C………A…………A………

……T……C……A……A……C………C…………A………

A

B

C

:

: :

DNA

200

70/80

23/120

200

2

0 7 1 A

T

T

T

T

T

C

• 

•  QTL

• 

•  GWAS

•  χ

2

•  Fisher

• 

• 

•  1 2

• 

3

•  3

AA, Aa, aa 3

30,000

• 

• 

2005

4

(2)

Pictured by Dr. Satoshi Niikura

5

QTL

DNA

QTL

• 

• 

• 

• 

×

6

DNA

SNP

• 

• 

• 

ATCGAG TAGACT TATACG

ATCGAG TAGACT TATACG

ATCGAG TAGACA TATACG

ATCGAG TAGACT TATACG

ATCGAG TAGACT TATACG

ATCGAG TAGACT TATACG

ATCGAG TAGACA TATACG

ATCGAG TAGACA TATACG

ATCGAG TAGACT TATACG

ATCGAG TAGACA TATACG

7

(single nucleotide polymorphisms: SNPs)

GeneChip Rice 44K SNP Genotyping Array

•  44,100 SNPs (about 1 SNP per 10 kb)

Tung et al. (2010) Rice 3:205-217

8

(3)

QTL

DNA

GWAS

DNA

Morrell et al. (2012) Nature Reviews Genetics 13:85

QTL vs.

linkage disequilibrium: LD

10

linkage disequilibrium: LD

QTL

(Modified from Balding 2006)

SNP

( )

SNP

( )

SNP

12

(4)

(Rafalski 2002)

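Haplotype frequencies at two biallelic loci (alleles A/a and B/b):

        B       b       Total
A       pAB     pAb     pA
a       paB     pab     pa
Total   pB      pb

$r^2 = \dfrac{D^2}{p_A p_a p_B p_b}$

($r^2 = 1$ under complete LD, $r^2 = 0$ under linkage equilibrium)

$D = p_{AB} - p_A p_B = p_{AB} p_{ab} - p_{Ab} p_{aB}$

Example: $r^2 = \dfrac{0.25^2}{0.5 \times 0.5 \times 0.5 \times 0.5} = 1$

Other example panels: $r^2 = 0.1024$ and $r^2 = 0$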
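The definitions of D and r² above can be checked numerically. The following is a minimal sketch in Python; the function name `ld_stats` and its argument names are illustrative assumptions, not part of the lecture material.

```python
def ld_stats(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, r2) from the four haplotype frequencies (hypothetical helper)."""
    # Allele frequencies are the margins of the 2 x 2 haplotype table.
    p_A = p_AB + p_Ab
    p_a = p_aB + p_ab
    p_B = p_AB + p_aB
    p_b = p_Ab + p_ab
    # D = pAB - pA * pB, which equals pAB * pab - pAb * paB.
    d = p_AB - p_A * p_B
    r2 = d ** 2 / (p_A * p_a * p_B * p_b)
    return d, r2


# Only AB and ab haplotypes exist, each at frequency 0.5.
d, r2 = ld_stats(0.5, 0.0, 0.0, 0.5)
print(d, r2)
```

With only AB and ab haplotypes present, D is 0.25 and r² is 1, matching the worked example. The function assumes both loci are polymorphic, since a monomorphic locus makes the denominator zero.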

13

• 

• 

(Rafalski 2002)

14

LD

Gupta et al. (2005)

• 

• 

15

AA

aa

AA aa

χ² test, Fisher's exact test

16

(5)

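χ² test

Observed counts of phenotypes (R), (S) by marker genotype:

          (R)         (S)         Total
AA        f11 (11)    f12 (4)     f1. (15)
aa        f21 (3)     f22 (7)     f2. (10)
Total     f.1 (14)    f.2 (11)    n (25)

Marginal proportions:

p(R) = f.1 / n = 14 / 25 = 0.56
p(S) = f.2 / n = 11 / 25 = 0.44
p(AA) = f1. / n = 15 / 25 = 0.60
p(aa) = f2. / n = 10 / 25 = 0.40

Expected counts under independence:

R and AA: n × p(R) × p(AA) = 25 × 0.56 × 0.60 = 8.4
R and aa: n × p(R) × p(aa) = 25 × 0.56 × 0.40 = 5.6
S and AA: n × p(S) × p(AA) = 25 × 0.44 × 0.60 = 6.6
S and aa: n × p(S) × p(aa) = 25 × 0.44 × 0.40 = 4.4

$\chi^2 = \sum \dfrac{(\mathrm{obs} - \mathrm{exp})^2}{\mathrm{exp}} = \dfrac{(11 - 8.4)^2}{8.4} + \dfrac{(3 - 5.6)^2}{5.6} + \dfrac{(4 - 6.6)^2}{6.6} + \dfrac{(7 - 4.4)^2}{4.4} = 4.57$

χ² has (r-1)(c-1) degrees of freedom, where r and c are the numbers of rows and columns of the table.

$\chi^2_{0.01}(1) = 6.63 > \chi^2 = 4.57 > \chi^2_{0.05}(1) = 3.84$

The association is significant at the 5% level.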
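The χ² calculation above can be reproduced in a few lines. This is a minimal sketch in plain Python; the function name `chi_square_statistic` is an assumption made for illustration, and the critical value 3.84 is the one quoted in the slide.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Rows: marker genotypes AA, aa. Columns: phenotypes R, S.
observed = [[11, 4],
            [3, 7]]
chi2 = chi_square_statistic(observed)
print(round(chi2, 2))          # 4.57
print(chi2 > 3.84)             # exceeds the 5% critical value for 1 d.f.
```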

17

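[Figure: phenotype distributions by QTL genotype (QQ, qq; means mQQ, mqq), the latent mixture distribution, and phenotype distributions by marker genotype (AA, aa; means mAA, maa); horizontal axis from -4 to 8]

Marker genotype by QTL genotype counts:

        QQ    qq
AA      40    10
aa      10    40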

$y_i = u + \beta_j x_{ij} + e_i$

[Figure: phenotype (y) plotted against marker genotype (x)]

Marker genotype coding for individual i: AA: $x_i = 2$, aa: $x_i = 0$

$y_i = \alpha + \beta x_i + \varepsilon_i = \hat{y}_i + \varepsilon_i$

y: dependent (response) variable

x: independent (explanatory) variable (SNP genotype)

β: regression coefficient

ε: residuals

α: intercept, constant term

$\hat{y}_i = \alpha + \beta x_i$

19

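The method of least squares

$\varepsilon_i = y_i - (\alpha + \beta x_i)$

Sum of squared errors:

$SSE = \sum_i^n \varepsilon_i^2 = \sum_i^n (y_i - \alpha - \beta x_i)^2$

Setting the partial derivatives to zero:

$\dfrac{\partial SSE}{\partial \beta} = -2 \sum_i^n (y_i - \alpha - \beta x_i) x_i = 0$

$\dfrac{\partial SSE}{\partial \alpha} = -2 \sum_i^n (y_i - \alpha - \beta x_i) = 0$

Solving for the estimates a and b:

$a = \dfrac{\sum_i^n y_i}{n} - b \dfrac{\sum_i^n x_i}{n} = \bar{y} - b\bar{x}$

$b = \dfrac{\sum_i^n x_i y_i - \sum_i^n x_i \sum_i^n y_i / n}{\sum_i^n x_i^2 - \left( \sum_i^n x_i \right)^2 / n} = \dfrac{\sum_i^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_i^n (x_i - \bar{x})^2}$

In matrix form:

$\begin{pmatrix} n & \sum_i^n x_i \\ \sum_i^n x_i & \sum_i^n x_i^2 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum_i^n y_i \\ \sum_i^n x_i y_i \end{pmatrix}$

[Figure: scatter plot of y against x with the fitted line $\hat{y}_i = \alpha + \beta x_i$ and residual $\varepsilon_i$]

(a and b are the estimates of α and β)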
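The closed-form estimates above translate directly into code. The sketch below uses plain Python; the function name `least_squares_fit` and the toy genotype and phenotype values are invented for illustration, with genotypes coded as AA = 2 and aa = 0 as in the slides.

```python
def least_squares_fit(x, y):
    """Return (a, b) minimising sum_i (y_i - a - b * x_i)^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # b = sum (x_i - x_bar)(y_i - y_bar) / sum (x_i - x_bar)^2
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    # a = y_bar - b * x_bar
    a = y_bar - b * x_bar
    return a, b


# Toy data (made up): marker genotype scores and phenotypes.
genotype = [0, 0, 2, 2, 0, 2]
phenotype = [4.0, 5.0, 7.0, 8.0, 4.5, 7.5]
a, b = least_squares_fit(genotype, phenotype)
print(a, b)  # 4.5 1.5
```

With a two-class marker, the intercept equals the mean phenotype of the aa group (4.5) and a + 2b equals the mean of the AA group (7.5), so b measures half the difference between genotype means.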

(6)

$y_i = u + \beta_j x_{ij} + e_i$

1311

SNPs

All materials can be downloaded from http://ricediversity.org/

SNPs …

[Figure: Manhattan plot, −log10(p) vs. position (bp)]

LD

LD

(Rafalski and Morgante 2004; Oraguzie et al. 2007)

:

p 

“… suppose that a would-be geneticist set out to study the “trait” of ability to eat with chopsticks in the San Francisco population by performing an association study with the HLA complex. The allele HLA-A1 would turn out to be positively associated with ability to use chopsticks … because the allele HLA-A1 is more common among Asians than Caucasians.”

Lander and Schork (1994)

HLA Human Leukocyte Antigen

24

(7)

?

1

2

(Modified from Balding 2006)

•  Indica

Japonica

$y_i = u + \beta_j x_{ij} + \sum_{k=1}^{K} v_k q_{ik} + e_i$

1.0

0.0 0.5

Bayesian

Structure: Pritchard et al. (2000) Genetics 155:945–959

4

6 K=6

2

qi1 = 0.78, qi3 = 0.32, 0

[Figure: Manhattan plot, −log10(p) vs. position (bp)]

(8)

29

A

•  Yu et al. (2006) Nat. Genet. 38: 203

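$y_i = u + \beta_j x_{ij} + \sum_{k=1}^{K} v_k q_{ik} + \alpha_i + e_i$

$\boldsymbol{\alpha} \sim N(\mathbf{0}, \mathbf{A}\sigma_\alpha^2)$

$\mathrm{var}(\alpha_i) = a(i, i)\,\sigma_\alpha^2$

$\mathrm{cov}(\alpha_i, \alpha_{i'}) = a(i, i')\,\sigma_\alpha^2$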

[Figure: Manhattan plot, −log10(p) vs. position (bp)]

1SNP

SNP

SNP

A A

B B

B A

32

B A

(9)

SNPs

SNP

A

A B

B A

33

B

Bayesian

QTL

$y_i = u + \sum_{j=1}^{J} \beta_j \gamma_j x_{ij} + \sum_{k=1}^{K} v_k q_{ik} + \alpha_i + e_i$

•  Iwata et al. (2007) Theor Appl Genet 114:1437–1449

$\gamma_j \in \{1, 0\}$: indicator variable for the j-th SNP

$\gamma_j \sim B(1, p_j)$

[Figure: posterior probability of QTL vs. position (bp)]

Bayesian

γ j

Chr. 3

GS3

63 kb

id3008333

546 kb

id3008127

Chr. 5

qSW5

id5002699

1.0 0.95

66 kb 0.88

[Figure: posterior probability of QTL vs. position (bp)]

SNPs

(10)

Q, K

•  Q

–  Structure (Pritchard et al. 2000)

–  (Price et al. 2006)

–  (Zhu and Yu 2009)

•  K:

–  (Loiselle et al. 1995; Ritland et al. 1996)

–  (Zhao et al. 2007)

– 

37

GWAS

Atwell et al. (2010) Nature 465: 627

→ a

39

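Outcomes of multiple tests, with counts A to D:

                         Not significant    Significant
Null hypothesis true     A                  B
Null hypothesis false    C                  D

false positive rate: FPR = B / (A + B)

false negative rate: FNR = C / (C + D)

false discovery rate: FDR = B / (B + D)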
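The three rates differ only in which counts appear in the denominator. A minimal sketch in Python follows; the function name `error_rates` and the example counts are assumptions chosen for illustration.

```python
def error_rates(a, b, c, d):
    """Error rates from the four cell counts A, B, C, D defined above."""
    return {
        "FPR": b / (a + b),  # false positives among all true nulls
        "FNR": c / (c + d),  # misses among all false nulls
        "FDR": b / (b + d),  # false positives among all significant tests
    }


# Made-up counts: 100 true nulls (10 called significant),
# 20 real effects (15 detected).
rates = error_rates(a=90, b=10, c=5, d=15)
print(rates)
```

In this made-up example the FPR is 0.1 while the FDR is 0.4: when true effects are rare, a modest false positive rate can still leave a large share of the significant results false.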

40

(11)

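•  FDR: false discovery rate

1.  Compute the P values of the K tests.

2.  Sort the P values in ascending order: $P_{(1)} \le P_{(2)} \le \dots \le P_{(K)}$

3.  For a target FDR $q^*$ (e.g., 5%), find the largest $i$ that satisfies $P_{(i)} \le \dfrac{i}{K} q^*$

4.  Declare tests 1 through $i$ significant.

Benjamini and Hochberg (1995)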

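•  With K tests, the adjusted p value is K × P

•  $P_{(i)} \le \dfrac{i q^*}{K} \Leftrightarrow q^* \ge \dfrac{K P_{(i)}}{i}$

•  q* is compared with K × P divided by the rank i

•  q* = 5%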
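The step-up rule of Benjamini and Hochberg (1995) can be sketched as follows. This is an illustrative plain-Python implementation; the function name `benjamini_hochberg` and the example P values are assumptions, not taken from the lecture.

```python
def benjamini_hochberg(p_values, q_star=0.05):
    """Return a list of booleans: True where the test is declared significant."""
    k = len(p_values)
    # Indices of the tests, ordered by ascending P value.
    order = sorted(range(k), key=lambda idx: p_values[idx])
    # Find the largest rank i with P_(i) <= (i / K) * q*.
    largest = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q_star / k:
            largest = rank
    # Tests ranked 1..largest are significant, even if some of them
    # individually failed their own threshold.
    significant = [False] * k
    for idx in order[:largest]:
        significant[idx] = True
    return significant


p = [0.041, 0.001, 0.205, 0.008, 0.039, 0.06, 0.074, 0.042]
print(benjamini_hochberg(p, q_star=0.05))
```

For these eight made-up P values the thresholds are 0.00625, 0.0125, 0.01875, and so on, so only the two smallest P values (0.001 and 0.008) are declared significant.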

GWAS

43

•  GWAS Hd6

Yano et al. (2016) Nature Genetics 48: 927

46

47

Yang et al. (2003) Am. J. Hum. Genet 73: 627

•  ACTN3 α-actinin-3

•  X α-actinin-3

•  α-actinin-3 α-actinin-2 ACTN3

126,559 GWAS

•  Fig. 1 3 SNPs

(Rietveld et al. 2014, PNAS 13790; Ward et al. 2014, PLoS ONE e100248)

•  R2 0.02%

•  Fig. 2

48

Rietveld et al. (2013) Science 340: 1467

(13)

• 

• 

• 

• 

• 

•  DNA

QTL

49

50

•  2050

1.7

• 

2050 90

Tester M, Langridge P (2010) Science 327: 818

• 

1975

/ 292

1982

1984

(14)

Meuwissen et al. (2001) Genetics 157:1819

DNA 5

Garcia-Ruiz et al.

Proc Natl Acad Sci 113(28): E3995-4004

“The most dramatic response to genomic selection was observed for the lowly heritable traits DPR, PL, and SCS. Genetic trends changed from close to zero to large and favorable, resulting in rapid genetic improvement in fertility, lifespan, and health in a breed where these traits eroded over time.”

(

, ,

)

(15)

Watanabe et al. (2005) Ann Bot. 95:1131

DNA

DNA

DNA

LD

GS

DNA

yi: i

xij: i j

y

x

3

x

4

x

1

x

2

x

K

$y = f(x_1, x_2, \dots, x_K)$

(y) DNA ($x_i$)

f(x)

f(.)

y DNA $x_1, \dots, x_K$

(16)

vs.

Watanabe et al. (2005) Ann Bot. 95:1131

DNA

DNA

$y_i = \mu + \sum_{j=0}^{N} \beta_j x_{ij} + e_i$

...

DNA

vs.

$y_i = \mu + \sum_{j=0}^{N} \beta_j x_{ij} + e_i$

vs.

$y_i = \mu + \sum_{j=0}^{N} \beta_j x_{ij} + e_i$

http://www.soran.net/index_html/A0084070.htm

1 3

F1

6

GS GS

GS GS

3

(17)

Marker-assisted selection: MAS

×

QTL QTL

MAS

• 

QTL

–  QTL QTL

•  QTL

–  QTL

–  QTL

GS

•  QTL

→ QTL

x

y = f (x)

\\\

x

y

273 142

304 423

173 373 234 138 203133 223

y = f (x)

(18)

GWAS GS

GS model

GWAS model

“large p small n” problem

p >> n

x, y 1

60K …

$y_i = \mu + \sum_{j=0}^{J} \beta_j x_{ij} + e_i$

PRESS

Se

GS

[Figure: scatter plots of y against x with fitted curves]

$y = b_0 + b_1 x$

$y = b_0 + \sum_{k=1}^{7} b_k x^k$

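(n-fold cross-validation)

1.  Divide the data into n groups.

2.  Remove the i-th group and build a model from the remaining data.

3.  Predict the i-th group with that model and compute the squared prediction error $(y_i - \hat{y}_i)^2$.

4.  Repeat steps 2 and 3 n times.

5.  Sum the squared prediction errors to obtain PRESS.

When n equals the number of samples, this is leave-one-out cross-validation.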
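The cross-validation procedure can be sketched in plain Python. The model here is simple linear regression purely for illustration; the function names `fit_line` and `press_cv`, the interleaved fold assignment, and the toy data are all assumptions rather than the lecture's own implementation.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    return y_bar - b * x_bar, b


def press_cv(x, y, n_folds):
    """Sum of squared prediction errors over n_folds held-out groups."""
    # Assign samples to folds by interleaving: 0, 1, ..., n_folds-1, 0, 1, ...
    folds = [list(range(k, len(x), n_folds)) for k in range(n_folds)]
    press = 0.0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        # Fit on the training part only, then predict the held-out part.
        a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        press += sum((y[i] - (a + b * x[i])) ** 2 for i in test_idx)
    return press


x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
print(press_cv(x, y, n_folds=len(x)))  # leave-one-out PRESS
```

Setting `n_folds` to the number of samples gives leave-one-out cross-validation. Because each prediction is made for data the model has not seen, PRESS penalises over-fitted models in a way that the training residual sum of squares does not.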

(19)

[Figure: GWAS result, log10(p) vs. Position (bp)]

SNPs

$x_{212}$, $x_{468}$, $x_{985}$, $x_{276}$, $x_{647}$

S: SNPs

$y_i = \sum_{j \in S} x_{ij} b_j + e_i$

[Figure: y.pred vs. y.obs for MLR, $r^2 = 0.36$]

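$\operatorname*{argmin}_{\mathbf{b}} (E) = \operatorname*{argmin}_{\mathbf{b}} \left[ \sum_i (y_i - \mathbf{x}_i^T \mathbf{b})^2 + \lambda \lVert \mathbf{b} \rVert^2 \right]$

$y_i = \sum_j^J x_{ij} b_j + e_i = \mathbf{x}_i^T \mathbf{b} + e_i$

$\lVert \mathbf{b} \rVert^2 = \sum_j^J b_j^2$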
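Setting the gradient of the penalised criterion above to zero gives the linear system $(\mathbf{X}^T\mathbf{X} + \lambda \mathbf{I})\mathbf{b} = \mathbf{X}^T\mathbf{y}$. The sketch below solves it in plain Python; the helper names `solve_linear_system` and `ridge_fit` and the toy data are assumptions for illustration, and no intercept is fitted (the data are assumed centred).

```python
def solve_linear_system(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * z[c] for c in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


def ridge_fit(x, y, lam):
    """Minimise sum_i (y_i - x_i^T b)^2 + lam * ||b||^2 and return b."""
    n, p = len(x), len(x[0])
    xtx = [[sum(x[i][j] * x[i][k] for i in range(n)) for k in range(p)]
           for j in range(p)]
    for j in range(p):
        xtx[j][j] += lam  # X^T X + lam * I
    xty = [sum(x[i][j] * y[i] for i in range(n)) for j in range(p)]
    return solve_linear_system(xtx, xty)


x = [[1, 0], [0, 1], [1, 1]]
y = [1, 2, 3]
print(ridge_fit(x, y, lam=0.0))  # ordinary least squares
print(ridge_fit(x, y, lam=1.0))  # coefficients shrunk toward zero
```

Adding λ to the diagonal also makes the system solvable when there are more markers than individuals (p >> n), where $\mathbf{X}^T\mathbf{X}$ alone is singular.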

[Figure: y.pred vs. y.obs, $r^2 = 0.50$; compared with MLR, $\Delta r^2 = +0.14$]

(20)

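LASSO

$\operatorname*{argmin}_{\mathbf{b}} \left[ \sum_i (y_i - \mathbf{x}_i^T \mathbf{b})^2 + \lambda \lVert \mathbf{b} \rVert_1 \right]$

$y_i = \sum_j^J x_{ij} b_j + e_i = \mathbf{x}_i^T \mathbf{b} + e_i$

LASSO: $\lVert \mathbf{b} \rVert_1 = \sum_j^J |b_j|$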

[Figure: plot of y against x, x from -2 to 2]

[Figure: y.pred vs. y.obs for LASSO, $r^2 = 0.54$; compared with MLR, $\Delta r^2 = +0.18$]

ridge

[Figure: Ridge coefficients vs. position (bp)]

[Figure: LASSO coefficients vs. position (bp)]

Ridge, LASSO

Ridge LASSO

0 shrink

GS3

LASSO

GS3

Ridge polygene LASSO

QTL

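Elastic net

$\operatorname*{argmin}_{\mathbf{b}} \left[ \sum_i (y_i - \mathbf{x}_i^T \mathbf{b})^2 + \lambda \left( \dfrac{1 - \alpha}{2} \lVert \mathbf{b} \rVert^2 + \alpha \lVert \mathbf{b} \rVert_1 \right) \right]$

$y_i = \sum_j^J x_{ij} b_j + e_i = \mathbf{x}_i^T \mathbf{b} + e_i$

$\lVert \mathbf{b} \rVert_1 = \sum_j^J |b_j|$

$\lVert \mathbf{b} \rVert^2 = \sum_j^J b_j^2$

α = 1: lasso

α = 0: ridge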
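One common way to minimise the elastic net criterion is cyclic coordinate descent with soft thresholding. The sketch below applies that technique to the objective exactly as written above; it is an illustrative plain-Python implementation, not the method used in the lecture, and the names `soft_threshold` and `elastic_net_fit` are assumptions. No intercept is fitted, and each marker column is assumed to contain at least one non-zero value.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning exactly 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_fit(x, y, lam, alpha, n_iter=200):
    """Coordinate descent for
    sum_i (y_i - x_i^T b)^2 + lam * ((1 - alpha) / 2 * ||b||^2 + alpha * ||b||_1).
    """
    n, p = len(x), len(x[0])
    b = [0.0] * p
    col_ss = [sum(x[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that ignores b_j.
            rho = 0.0
            for i in range(n):
                partial = y[i] - sum(x[i][k] * b[k] for k in range(p) if k != j)
                rho += x[i][j] * partial
            # One-dimensional minimiser of the objective in b_j.
            b[j] = soft_threshold(2.0 * rho, lam * alpha) / (
                2.0 * col_ss[j] + lam * (1.0 - alpha))
    return b


x = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]
print(elastic_net_fit(x, y, lam=0.0, alpha=1.0))    # no penalty: [2.0]
print(elastic_net_fit(x, y, lam=28.0, alpha=0.0))   # ridge-type shrinkage: [1.0]
print(elastic_net_fit(x, y, lam=100.0, alpha=1.0))  # lasso sets it to [0.0]
```

The last two calls show the qualitative difference between the penalties: the squared penalty shrinks the coefficient but keeps it non-zero, while a large enough absolute-value penalty sets it exactly to zero.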
