060310391 0560565
6
2015/11/6 13:00-14:45
@1 - 4
1
(association analysis)
(association study)
……T……C……A……G……T………A…………A………
•
DNA
……T……C……A……A……T………A…………A………
……T……A……G……G……T………A…………A………
……G……C……A……A……T………C…………T………
……T……A……A……G……T………A…………A………
……T……C……A……A……T………A…………A………
……T……C……A……G……T………A…………T………
……T……A……G……G……T………C…………A………
……G……C……A……G……C………A…………T………
……T……A……A……G……C………A…………A………
……T……C……A……G……C………C…………T………
……T……A……A……A……C………C…………A………
……T……C……G……G……C………A…………T………
……G……A……A……G……C………A…………A………
……T……C……A……A……C………C…………A………
A
B
C
:
: :
DNA
200
70/80
23/120
200
2
•
• QTL
•
• GWAS
• χ² test
• Fisher's exact test
•
•
• 1 2
•
3
• 3
AA, Aa, aa 3
30,000•
•
2005
4
QTL
• DNA
QTL
•
•
•
•
×
6
• DNA
SNP
•
•
•
ATCGAG TAGACT TATACG
ATCGAG TAGACT TATACG
ATCGAG TAGACA TATACG
ATCGAG TAGACT TATACG
ATCGAG TAGACT TATACG
ATCGAG TAGACT TATACG
ATCGAG TAGACA TATACG
ATCGAG TAGACA TATACG
ATCGAG TAGACT TATACG
ATCGAG TAGACA TATACG
(single nucleotide polymorphisms: SNPs)
GeneChip Rice 44K SNP Genotyping Array
• 44,100 SNPs (about 1 SNP per 10 kb)
Tung et al. (2010) Rice 3:205-217
QTL
DNA
GWAS
DNA
Morrell et al. (2012) Nature Reviews Genetics 13:85
QTL vs.
QTL
Yes No
linkage disequilibrium: LD
QTL
(Modified from Balding 2006)
SNP
( )
SNP
( )
SNP
(Rafalski 2002)
         B      b
A       pAB    pAb    pA
a       paB    pab    pa
        pB     pb

$r^2 = \dfrac{D^2}{p_A \, p_a \, p_B \, p_b}$
(●● 1,●● 0
)
$D = p_{AB} - p_A p_B = p_{AB}\, p_{ab} - p_{Ab}\, p_{aB}$

$r^2 = \dfrac{0.25^2}{0.5 \times 0.5 \times 0.5 \times 0.5} = 1$
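The LD definitions can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `ld_stats` and its argument names are hypothetical, and the inputs are the four haplotype frequencies pAB, pAb, paB, pab.

```python
def ld_stats(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, r2) from the four haplotype frequencies (hypothetical helper)."""
    p_A = p_AB + p_Ab          # allele frequency of A
    p_a = p_aB + p_ab          # allele frequency of a
    p_B = p_AB + p_aB          # allele frequency of B
    p_b = p_Ab + p_ab          # allele frequency of b
    D = p_AB - p_A * p_B       # equals p_AB * p_ab - p_Ab * p_aB
    r2 = D ** 2 / (p_A * p_a * p_B * p_b)
    return D, r2


# Complete LD: only haplotypes AB and ab exist, each at frequency 0.5.
D_full, r2_full = ld_stats(0.5, 0.0, 0.0, 0.5)

# Linkage equilibrium: all four haplotypes are equally frequent.
D_none, r2_none = ld_stats(0.25, 0.25, 0.25, 0.25)

print(D_full, r2_full)
print(D_none, r2_none)
```

The first call corresponds to the slide's example with $D = 0.25$ and all allele frequencies equal to 0.5; the second is an assumed equilibrium case added for contrast.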
[Figure: further example configurations of loci A and B, with $r^2 = 0.1024$ and $r^2 = 0$]
•
•
(Rafalski 2002)
LD
Gupta et al. (2005)
•
•
LD
(Zhu et al. 2007)
LD
→ LD
↑
↓
AA
aa
AA aa
χ2 Fisher Fisher’s exact test
χ² test

          R           S
AA     f11 (11)    f12 (4)     f1. (15)
aa     f21 (3)     f22 (7)     f2. (10)
       f.1 (14)    f.2 (11)    n (25)

p(R) = f.1 / n = 14 / 25 = 0.56
p(S) = f.2 / n = 11 / 25 = 0.44
p(AA) = f1. / n = 15 / 25 = 0.60
p(aa) = f2. / n = 10 / 25 = 0.40

↓ Expected counts
R, AA: n × p(R) × p(AA) = 25 × 0.56 × 0.60 = 8.4
R, aa: n × p(R) × p(aa) = 25 × 0.56 × 0.40 = 5.6
S, AA: n × p(S) × p(AA) = 25 × 0.44 × 0.60 = 6.6
S, aa: n × p(S) × p(aa) = 25 × 0.44 × 0.40 = 4.4

$\chi^2 = \sum \dfrac{(obs - exp)^2}{exp} = \dfrac{(11 - 8.4)^2}{8.4} + \dfrac{(3 - 5.6)^2}{5.6} + \dfrac{(4 - 6.6)^2}{6.6} + \dfrac{(7 - 4.4)^2}{4.4} = 4.57$

χ² follows a χ² distribution with (r-1)(c-1) degrees of freedom (r: number of rows, c: number of columns).
Significant at the 5% level, but not at the 1% level:
$\chi^2_{0.01}(1) = 6.63 > \chi^2 = 4.57 > \chi^2_{0.05}(1) = 3.84$
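The χ² calculation for the 2 × 2 table (AA: 11, 4; aa: 3, 7) can be reproduced with a few lines of code. This is a minimal sketch, not part of the original slides: the function name `chi_square_stat` is hypothetical, and the critical values 3.84 and 6.63 are the ones quoted in the text.

```python
def chi_square_stat(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence: n * p(row) * p(column)
            exp = row_totals[i] * col_totals[j] / n
            stat += (obs - exp) ** 2 / exp
    return stat


# Rows: marker genotypes AA, aa; columns: phenotype classes R, S.
observed = [[11, 4], [3, 7]]
chi2 = chi_square_stat(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi2, 2), df)
print(3.84 < chi2 < 6.63)  # between the 5% and 1% critical values for df = 1
```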
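[Figure: phenotype distributions by QTL genotype (QQ, qq; means mQQ, mqq) and by marker genotype (AA, aa; means mAA, maa). The phenotype distribution of each marker genotype is a latent mixture distribution.]

                     QTL genotype
Marker genotype      QQ     qq
AA                   40     10
aa                   10     40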
$y_i = \mu + \beta_j x_{ij} + e_i$

[Figure: phenotype (y) plotted against marker genotype (x)]

Individual i is AA: $x_i = 2$
Individual i is aa: $x_i = 0$
[Figure: scatter plot of y against x with the fitted regression line]

$y_i = \alpha + \beta x_i + \varepsilon_i = \hat{y}_i + \varepsilon_i$
$\hat{y}_i = \alpha + \beta x_i$

y: dependent (response) variable
x: independent (explanatory) variable (SNP)
β: regression coefficient
ε: residuals
α: intercept, constant term
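The method of least squares

$\varepsilon_i = y_i - (\alpha + \beta x_i)$

$SSE = \sum_i^n \varepsilon_i^2 = \sum_i^n (y_i - \alpha - \beta x_i)^2$

$\dfrac{\partial SSE}{\partial \beta} = -2 \sum_i^n (y_i - \alpha - \beta x_i)\, x_i = 0$

$\dfrac{\partial SSE}{\partial \alpha} = -2 \sum_i^n (y_i - \alpha - \beta x_i) = 0$

$a = \dfrac{\sum_i^n y_i}{n} - b\, \dfrac{\sum_i^n x_i}{n} = \bar{y} - b \bar{x}$

$b = \dfrac{\sum_i^n x_i y_i - \sum_i^n x_i \sum_i^n y_i / n}{\sum_i^n x_i^2 - \left(\sum_i^n x_i\right)^2 / n} = \dfrac{\sum_i^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_i^n (x_i - \bar{x})^2}$

$\begin{pmatrix} n & \sum_i^n x_i \\ \sum_i^n x_i & \sum_i^n x_i^2 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum_i^n y_i \\ \sum_i^n x_i y_i \end{pmatrix}$

(a and b are the estimates of α and β)

[Figure: scatter plot of y against x showing $y_i$, $\hat{y}_i = \alpha + \beta x_i$, $x_i$ and the residual $\varepsilon_i$]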
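The closed-form least-squares estimates a and b can be computed directly from the deviation sums. This is a minimal sketch, not part of the original slides: the function name `least_squares` and the six data points are made up for illustration, with marker genotypes coded 0 (aa) and 2 (AA).

```python
def least_squares(x, y):
    """Return (a, b), the least-squares estimates of intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # b = sum((x_i - x_bar)(y_i - y_bar)) / sum((x_i - x_bar)^2)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    # a = y_bar - b * x_bar
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data: genotype codes (aa = 0, AA = 2) and phenotype values.
x = [0, 0, 0, 2, 2, 2]
y = [48.0, 50.0, 49.0, 52.0, 53.0, 51.0]

a, b = least_squares(x, y)
print(a, b)
```

With two genotype classes coded 0 and 2, the intercept a is the mean phenotype of the aa group and 2b is the difference between the two group means.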
$y_i = \mu + \beta_j x_{ij} + e_i$
1311
SNPs
All materials can be downloaded from http://ricediversity.org/
SNPs …
[Figure: Manhattan plot; x axis: position (bp), y axis: −log10(p)]
LD
LD
(Rafalski and Morgante 2004; Oraguzie et al. 2007)
p
“… suppose that a would-be geneticist set out to
study the “trait” of ability to eat with chopsticks in the
San Francisco population by performing an
association study with the HLA complex. The allele
HLA-A1 would turn out to be positively associated
with ability to use chopsticks … because the allele
HLA-A1 is more common among Asians than
Caucasians.”
Lander and Schork (1994)
HLA: Human Leukocyte Antigen
?
1
2
(Modified from Balding 2006)
• Indica
Japonica
$y_i = \mu + \beta_j x_{ij} + \sum_{k=1}^{K} v_k q_{ik} + e_i$
1.0
0.0 0.5
Bayesian
Structure: Pritchard et al. (2000) Genetics 155:945–959
4
6 K=6
2
qi1 = 0.78, qi3 = 0.32, 0
[Figure: Manhattan plot; x axis: position (bp), y axis: −log10(p)]
…
→
A
• Yu et al. (2006) Nat. Genet. 38: 203
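$y_i = \mu + \beta_j x_{ij} + \sum_{k=1}^{K} v_k q_{ik} + \alpha_i + e_i$

$\boldsymbol{\alpha} \sim N(0, A\sigma_\alpha^2)$

$\mathrm{var}(\alpha_i) = a(i, i)\, \sigma_\alpha^2$
$\mathrm{cov}(\alpha_i, \alpha_{i'}) = a(i, i')\, \sigma_\alpha^2$
→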
[Figure: Manhattan plot; x axis: position (bp), y axis: −log10(p)]
1SNP
SNP
SNP
A A
B B
B A
B A
SNPs
SNP
A
A B
B A
B
Bayesian
QTL
$y_i = \mu + \sum_{j=1}^{J} \beta_j \gamma_j x_{ij} + \sum_{k=1}^{K} v_k q_{ik} + \alpha_i + e_i$
• Iwata et al. (2007) Theor Appl Genet 114:1437–1449
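$\gamma_j = \begin{cases} 1 & \text{(SNP } j \text{ is included in the model)} \\ 0 & \text{(SNP } j \text{ is not included in the model)} \end{cases}$

$\gamma_j \sim B(1, p_j)$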
[Figure: posterior probability of QTL plotted against position (bp)]
Bayesian
→
γ j
…
Chr. 3
GS3
63 kb
id3008333
546 kb
id3008127
Chr. 5
qSW5
id5002699
0.95 1.0
66 kb 0.88
[Figure: posterior probability of QTL plotted against position (bp)]
SNPs
GWAS & GS
GWAS
GS
GS
•
•
• NA
•
• NA
• AC 0
Q, K
• Q
– Structure (Pritchard et al. 2000)
– (Price et al. 2006)
– (Zhu and Yu 2009)
• K:
– (Loiselle et al. (1995), Ritland et al. (1996))
– (Zhao et al. 2007)
–
GWAS
Atwell et al. (2010) Nature 465: 627
→ a
A B (
C ( D
false positive rate: FPR = B / (A + B)
false negative rate: FNR = C / (C + D)
false discovery rate: FDR = B / (B + D)
• FDR: false discovery rate
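1. Compute the P values of the K tests
2. Sort the P values in ascending order: $P_{(1)} \le P_{(2)} \le \dots \le P_{(K)}$
3. With the FDR level q* (e.g., 5%), find the largest i that satisfies $P_{(i)} \le \dfrac{i}{K}\, q^*$
4. Reject the hypotheses ranked 1 to i
Benjamini and Hochberg (1995)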
• p
• K P
p K × P
• $P_{(i)} \le \dfrac{i\, q^*}{K} \iff q^* \ge \dfrac{K P_{(i)}}{i}$
• q* i
K × P
• q* 5%
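The Benjamini and Hochberg (1995) step-up procedure can be sketched as follows. This is a minimal illustration, not part of the original slides: the function name `benjamini_hochberg` and the ten P values are hypothetical, and q* is set to 5%.

```python
def benjamini_hochberg(p_values, q_star=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    K = len(p_values)
    # Indices of the tests, sorted so that P(1) <= P(2) <= ... <= P(K).
    order = sorted(range(K), key=lambda idx: p_values[idx])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank i with P(i) <= i * q* / K.
        if p_values[idx] <= rank * q_star / K:
            largest_rank = rank
    rejected = [False] * K
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical P values from K = 10 tests (deliberately unsorted).
p_values = [0.039, 0.001, 0.205, 0.008, 0.041, 0.212, 0.042, 0.060, 0.216, 0.074]
rejected = benjamini_hochberg(p_values, q_star=0.05)

print([p for p, r in zip(p_values, rejected) if r])
```

For comparison, a Bonferroni-style threshold at the same 5% level would require P ≤ 0.05 / K = 0.005 for every test, whereas the threshold used here grows with the rank i.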
GWAS
• GWAS Hd6
Yano et al. (2016) Nature Genetics 48: 927
Yang et al. (2003) Am. J. Hum. Genet 73: 627
• ACTN3: α-actinin-3
• X: α-actinin-3
• α-actinin-3, α-actinin-2, ACTN3
126,559 GWAS
• Fig. 1 3 SNPs
(Rietveld et al. 2014, PNAS 13790; Ward et al. 2014, PLoS ONE e100248)
• R2 0.02%
• Fig. 2
Rietveld et al. (2013) Science 340: 1467
•
•
•
•
•
• DNA
QTL
•
•
QTL
•
• :
(2000/01)
• ISBN-10: 4130602063
• ISBN-13: 978-4130602068
http://www.amazon.co.jp/
6
1. 100 2
A, B AABB,
aaBB, AAbb, aabb 40, 10, 10, 40
AB D r2
2. AB
AB 1
$\chi^2_{0.01}(1) = 6.63$, $\chi^2_{0.05}(1) = 3.84$
PDF MS-Word PDF
e
2016 11 14
Fisher
Fisher’s exact test
•
“ ”
•
•
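         R     S
AA       a     c     g
aa       b     d     h
         e     f     n

Probability of the marginal totals e, f, g, h:
$P(e, f, g, h) = \binom{n}{e} p^e q^f \times \binom{n}{g} r^g s^h = \dfrac{(n!)^2}{e!\, f!\, g!\, h!}\, p^e q^f r^g s^h$
$p = e/n,\ q = f/n,\ r = g/n,\ s = h/n$

Joint probability of a, b, c, d and e, f, g, h:
$P(a, b, c, d, e, f, g, h) = \binom{n}{e} p^e q^f \times \binom{e}{a} r^a s^b \times \binom{f}{c} r^c s^d = \dfrac{n!}{e!\, f!} \cdot \dfrac{e!}{a!\, b!} \cdot \dfrac{f!}{c!\, d!}\, p^e q^f r^g s^h$

Conditional probability of a, b, c, d given e, f, g, h:
$P(a, b, c, d \mid e, f, g, h) = \dfrac{P(a, b, c, d, e, f, g, h)}{P(e, f, g, h)} = \dfrac{e!\, f!\, g!\, h!}{n!} \cdot \dfrac{1}{a!\, b!\, c!\, d!}$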
P(B)
P(A ∩ B)
P(A | B) = P(A ∩ B) /P(B)
a, b, c, d:

         R     S
AA      11     4    15
aa       3     7    10
        14    11    25

$p_x = \dfrac{14!\, 11!\, 15!\, 10!}{25!} \left( \dfrac{1}{11!\, 3!\, 4!\, 7!} \right) = 0.0367$
Fisher
χ2
5%
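Other tables with the same marginal totals:

12  3 | 13  2 | 14  1 | 10  5 |  9  6 |  8  7 |  7  8 |  6  9 |  5 10 |  4 11
 2  8 |  1  9 |  0 10 |  4  6 |  5  5 |  6  4 |  7  3 |  8  2 |  9  1 | 10  0

$p = \dfrac{14!\, 11!\, 15!\, 10!}{25!} \left( \dfrac{1}{11!\, 3!\, 4!\, 7!} + \dfrac{1}{12!\, 2!\, 3!\, 8!} + \dfrac{1}{13!\, 1!\, 2!\, 9!} + \dfrac{1}{14!\, 0!\, 1!\, 10!} + \dfrac{1}{5!\, 9!\, 10!\, 1!} + \dfrac{1}{4!\, 10!\, 11!\, 0!} \right) = 0.0486$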
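The exact-test calculation for the table (AA: 11, 4; aa: 3, 7) can be reproduced by enumerating every table with the same marginal totals. This is a minimal sketch, not part of the original slides: the function names are hypothetical, and the two-sided P value is taken as the sum of the probabilities of all tables that are no more probable than the observed one.

```python
from math import comb


def table_probability(x, row1, row2, col1):
    """Probability of the table whose top-left cell is x, given fixed margins."""
    return comb(row1, x) * comb(row2, col1 - x) / comb(row1 + row2, col1)


def fisher_exact_two_sided(table):
    """Return (probability of the observed table, two-sided P value)."""
    (a, c), (b, d) = table
    row1, row2, col1 = a + c, b + d, a + b
    p_obs = table_probability(a, row1, row2, col1)
    lo = max(0, col1 - row2)   # smallest possible top-left count
    hi = min(row1, col1)       # largest possible top-left count
    p_value = 0.0
    for x in range(lo, hi + 1):
        p_x = table_probability(x, row1, row2, col1)
        # Small relative tolerance guards against floating-point ties.
        if p_x <= p_obs * (1 + 1e-9):
            p_value += p_x
    return p_obs, p_value


# Rows: AA, aa; columns: R, S.
p_obs, p_value = fisher_exact_two_sided([[11, 4], [3, 7]])
print(round(p_obs, 4), round(p_value, 4))
```

The loop visits top-left counts from 4 to 14, which matches the eleven possible tables; the tables with counts 11, 12, 13, 14, 5 and 4 contribute to the P value.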
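[Figure: three scatter plots of y against x showing (a) the deviations of $y_i$ from $\bar{y}$, (b) the deviations of $\hat{y}_i$ from $\bar{y}$, and (c) the deviations of $y_i$ from $\hat{y}_i$]

$\sum_i^n (y_i - \bar{y})^2 = \sum_i^n (\hat{y}_i - \bar{y})^2 + \sum_i^n (y_i - \hat{y}_i)^2$
(a) SSY = (b) SSR + (c) SSE

        Sum of squares    d.f.     Mean square            F
(b)     SSR               1        MSR = SSR / 1          MSR / MSE
(c)     SSE               n - 2    MSE = SSE / (n-2)
(a)     SSY               n - 1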
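The decomposition SSY = SSR + SSE and the resulting F statistic can be checked on a small data set. This is a minimal sketch, not part of the original slides: the function name `regression_anova` and the six data points are made up for illustration.

```python
def regression_anova(x, y):
    """Return (SSY, SSR, SSE, F) for a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Least-squares slope and intercept.
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    a = y_bar - b * x_bar
    y_hat = [a + b * xi for xi in x]
    ssy = sum((yi - y_bar) ** 2 for yi in y)                 # (a) total
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # (b) regression
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # (c) residual
    msr = ssr / 1          # regression has 1 degree of freedom
    mse = sse / (n - 2)    # residual has n - 2 degrees of freedom
    return ssy, ssr, sse, msr / mse


# Hypothetical data: genotype codes (aa = 0, AA = 2) and phenotype values.
x = [0, 0, 0, 2, 2, 2]
y = [48.0, 50.0, 49.0, 52.0, 53.0, 51.0]

ssy, ssr, sse, f_stat = regression_anova(x, y)
print(ssy, ssr, sse, f_stat)
```

The F statistic is compared with an F distribution with 1 and n − 2 degrees of freedom; the critical value is not computed here.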
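$J = \sum_{i=1}^{N} (y_i - \alpha - \beta x_i)^2$

$y_i = \alpha + \beta x_i + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2)$ (mean 0, variance $\sigma^2$)

$y_i \sim N(\alpha + \beta x_i, \sigma^2)$ (y has mean $\alpha + \beta x$ and variance $\sigma^2$)

[Figure: normal density of $y_i$ centered at $\alpha + \beta x_i$]

$p(y_i) = \dfrac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\dfrac{(y_i - \alpha - \beta x_i)^2}{2\sigma^2} \right\}$

$L = p(y_1, y_2, \dots, y_N) = \prod_{i=1}^{N} \dfrac{\exp\{ -(y_i - \alpha - \beta x_i)^2 / 2\sigma^2 \}}{\sqrt{2\pi\sigma^2}} = \dfrac{\exp\left\{ -\sum_{i=1}^{N} (y_i - \alpha - \beta x_i)^2 / 2\sigma^2 \right\}}{\left(\sqrt{2\pi\sigma^2}\right)^N}$

$\ln L = -\dfrac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - \alpha - \beta x_i)^2 - \dfrac{N}{2} \log 2\pi\sigma^2$

$\dfrac{\partial \ln L}{\partial \beta} = \dfrac{1}{\sigma^2} \sum_{i=1}^{N} (y_i - \alpha - \beta x_i)\, x_i = 0$

$\dfrac{\partial \ln L}{\partial \alpha} = \dfrac{1}{\sigma^2} \sum_{i=1}^{N} (y_i - \alpha - \beta x_i) = 0$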