!"#$%&'()*+,
-./01"234
!"#$%
†! ! &'()
†† *+,-./01234526789:;<=5>?@1#ABCDEFGH<;IJKL,M./6NOPQR STLUVWX9YZ"S-./[=5>?@1#ABC\]^_<;IJ`a[bcd-./6ef[gIM1 hijklm9:;<=5>?@1#ABCno[gIpFqrsIJ AMAF Ft;Luv[b,0wxyz{|6}~•€F•‚9ƒ„^…I†R9‡ˆ‰,Š.j0123/9‹‰ < 92%Œ•Žs†RC[•LJKLw.>•52Ft;L0wxyz{6‘’sI†R[,‘’“9‹‰,53%Œ •Žs†RC[•LJMonte-Carlo Tree Search
in the Game of Blokus-Duo
Kenta Sasaki
†!
Yoshiyuki Kotani
††! !
Recently, Monte-Carlo Tree Search attracts attention in a field of the game programming. Moreover, Monte-Carlo Tree Search is studied by various games as well as the game of Go that became the instigator of the boom. This paper inspects it in Blokus-Duo whether Monte-Carlo Tree Search is effective.
In the experiment that used AMAF, it succeeded in a great increase of the update frequency after play-out, and it won the base program by 92%. In addition, by the experiment that used a rating for, it won the program before the improvement by 53%.
1.
567)
*+,-./01234526789:;<=5>? @1#ABCDEFGH<;IJ”6•–R‰<,AB R—˜™€Ft;Leš›Sœ•[gKž’;Ÿ F¡ ¢_<;SpTLUV9:;<••S‡ˆF£¤L†R Cg¥¢_I[1]J¦*[bUV§P[S¨,©ª«m¬ 1S-Z"S-./9:;<=5>?@1#ABFt; Lœ•C\]^_<;I[2][3]J `a[bcd-./6ef[gIM1hijklm9 ‹‰<=5>?@1#ABFt;I†R[®¨SIpq rF)¯JM1hijklmbGPCC 6°±R‰<e² +,³+R´•,µ+Y°±R‰<¶ž·¥¢_<;I ¸¹º»-./[gIJ†6-./6¼½R‰<,¾¿ œ€CÀÁ9Â;†RCg¥¢_IJÜ[b¾¿œ€ C414 Äžgž,€œÅ[b¾¿œ€C 700Æ800 ÄžF ÇÈI†RYÉÊ9:†IJ†6L¤,ef6Ëœ9‹ † ÌÍÎϕЕÐÑ ÏÐÒ º»ÏÐÓÔ!Department of Computer and Information Sciences, Graduate school of Technology, Tokyo University of Agriculture and Technology
††
ÌÍÎϕЕÐÑ ÏÐÒ
Department of Computer and Information Sciences, Tokyo University of Agriculture and Technology
sI0wxyz{€CÕ6-./9ÖרS¨SI†R p¢,oÙÚ¨=5>?@1#ABF)¯†RCef6 ° ± R ‰ < Û ¥ ¢ _ I J † 6 Ü ± F Ý Þ s I L ¤ 9 All-Moves-As-First ßl.àj>•hiáWâ AMAFãR ¼½6w.>•52Ft;<,0123/6wŠ@‘( FäåLJ
2.
-./01"23489:
=5>?@1#AB6‘’R‰<2.1 [b AMAF,2.2 [bBradley-Terry =k@9ÚIw.>•526æç,2.3 [bw.>•526M1hijklmè6ét9f;< êësIJ 2.1 AMAF gI)ìa 6˜íFîïsIðñ,òóSô•[b0 wxyz{6Ü[a Cõö_L¹÷øù.ú6£ûF üý‰<a 6˜íFîïsIJ†_9‹‰,a C0wxy z{6þÿ6!"[õö_L¹÷øù.ú6£ûFüý sIœ•Cgž,#5øl.$UV678[bAMAF R ‰<»%^_<;I[4]J†6œ•b)ì&F'(¾¿S103
-þÿ6789)t[•IR^_<:ž,M1hijkl m[b)ìFe*+_,ÈI†RC¾¿[gI†Rp¢
)tC¾¿[gI JKL,-LyxkyR‰< RAVE
(Rapid Action Value Estimate)CgI[5]J
2.2 Bradley-Terry -&1);,<=/>.?8@ A ! ./jS-6012-./9:PIu3îïô•R‰ <Elo w.>•52Ft;Læçô•CgIJ†6æç ô•6•4›569gLIY6[Bradley-Terry =k@C gž,†6=k@Ft;I†R[,gIäñ9™sIŒ 789F)¯†RC[•IJ ! ¼½R‰<,:€16äñF;¯†RC[•,KL, <1§P[S¨../Fd=sI†RY[•IJ>Èö, ../1-2-3,2-4,4-5 C;LR•,2-4 CŒf?Ùb, R@^_IJ†6=k@Ft;<,AËœ6w.>•5 2p¢,ËœB6Œ7F8CsI†RC¾¿RSIJ† 6w.>•526æçô•bR. Coulom 9ÚT<DE^ _L[6]J 2.3 !"#$%&'(B8CD ! w.>•526æç9b,æçsIL¤9FtsIª GCHIRSIJUV«©ª6ðñ,01ªJ6ªGF FtsI†R[’;w.>•526íF¡I†RC[• IC,M1hijklm9b01ªJbKL‰S;J` a[b,†6ܱF•MsIL¤9,UCT y@NàO/ [7]FFt‰L=5>?@10123/9ÚT<ªGFP ‡‰,QRk.$R‰LJGPCC ST6M1hijkl
m•U[ UCT y@NàO/9ÚI0123/C’;‡
V[gTL†Rp¢,wŠ@CWX^_LªG§RdÈ IJ ! Y¤LªG40 ZFFt‰,w.>•526æçF)T LJWâ6Ú¯S¼½FFt‰LJ [ \F]f˜í 21 ^_ [ A`jEFRI˜í 196 < †6ñæ217 <6¼½Fabá1Æ8 œEã,cbá9Æ16 œEã,dbá16 œEWeã6ff9ghTL,ñæ 651 <6¼½9f;<æç‰LJ w.>•526æçŸ 6 e*Fi1,2,3 9jsJ i.1 \6w.>•52 i.2 \kl i.3 ab9:PIm(˜í
3.
EF
3.1 [b AMAF Ft;Luv,3.2 [bw.>•52F t;Luv,3.3 [b AMAF Rw.>•52Fnåño …LuvF)¯J 3.1 AMAF GDHIEF AMAF 6p¿FÖqsIL¤90wxyz{}~•€ 6Öq,‘’“R‘’|[‹ZuvF)TLJ 3.1.1 J<KLMNOPQR8ST ! ur9ZsFtÈ,AMAF FFt‰LY6RuFt6 Y6R[0wxyz{}~•€6ÖqF)TLJvwB xLž60wxyz{}~•€[email protected] 9jsJ @.1 0wxyz{}~•€6Öq ZsA Zs B Zs C uFt 168 298 872104
-Ft 436 1606 5769 ! ! Zs-A! Zs-B! ! ! ! ! Zs-C †6Ÿ Úž,œ€Cyz§Zs{-Ëœ6+_,È C¾¿SðñCÂ;†RC7pIJ 3.1.2 UVEF
AMAF 6o F|×IL¤9,3.1.1 6uv!6}~[ AMAF Ft‰LY6RuFt6Y6F‹Z^…LJ eœ10 w,ñæ 100 ‹Z)TLá••7Pb 0.5 ŒãJ Ÿ b,Ft‰LY6C92 ŒR••¨Œ•މLJ†6 †Rp¢M1hijklm9:;<,AMAF bno[g I†RCopIJ 3.2 <=/>.?GDHIEF ! æç‰Lw.>•526p¿—˜,0wxyz{9n å€z§ðñ6p¿—˜F)TLJ 3.2.1 <=/>.?8WXYZ æç‰Lw.>•526íC-6•~6®^[gIp |×IL¤9‚€ËœR‹Z^…<p¿F|×LJw. >•52Ft;LËœ6õƒô•R‰<0^_6ô•[ uvF)TLJ ávã@.wh{„…›Ëœõƒô•[,¹¾¿œN < 9‹‰<,gI¾¿œi Cõƒ^_I?Ùb, R@s†RC[•IJ á†ã35‡/90f6œFõH,w.>•52Cˆ; œFõƒsIJ †60f6Ëœô•Ft;<‚€ËœR 1,000 ‹Z) TLá••7Pb0.5 ŒãJŸ [email protected] 9jsJ @.2 w.>•526p¿Öq ! ! "#$%&'() *+! "#$%, -.. /0 123456! 789&:) $%! .;<=>! ?@;A! 7B9&:) $%! .;A@?! ?<;?! 3.2.2 J<KLMNB8<=/>.?8[D 3.2.1 [j‰L0^_6Ëœõƒô•F0wxyz{! 6Ëœõƒ9ét‰,0wxyz{6‘’F)TLJ0 wxyz{6‘’“R‹Z^…I†R[®^C‰Š‰L pF|×LJ eœ10 w,ñæ 100 ‹Z)TLá••7Pb 0.5 ŒãJ Ÿ [email protected] 9jsJ @.3 0wxyz{‘’9ÚIŒÙ ! CDE&'()*+! 789&:)$%! .;@>! 7B9&:)$%! .;@-! 3.3 AMAF \<=/>.?8]^_`a AMAF 9ÚI0wxyz{6}~R,w.>•52F t;L0wxyz{6‘’6nåño…FdÈIJ nå ño…ô•R‰<Wâ60^_6ô•Fä‰LJ ‹Œ! 3.1 [j‰LòóS AMAF R 3.2 [j‰L‘’‰ L0wxyz{Fnåño…
•Œ! AMAF 9ÚT<0wxyz{Ÿ F}~sIð
ñ,Ž••xO6ø.j6厑9}~F)¯Ú¯9‰
LY6á‘’AMAFãR 3.2 [j‰L‘’‰L0wxy
z{Fnåño…
†_¢6nåño…R,òóSAMAF R‚€0wxy
z{Fnåño…LY6R‹Z^…I†R[®^C‰Š ‰LpF|×LJ
-eœ10 w,ñæ 100 ‹Z)TLá••7Pb 0.5 ŒãJ Ÿ [email protected] 9jsJ @.4 AMAF Rw.>•526nåño… ! ! 7FG HIHJ!K! "# LM9 &'()*+! FG HIHJ!K! 789&:) LM! .;>@@! FG HIHJ!K! 7B9&:) LM! .;>N! CD HIHJ!K! 789&:) LM! .;@N! CD HIHJ!K! 7B9&:) LM! .;>?! ! òóSAMAF R‘’‰L0wxyz{6nåño…b, ‚€0wxyz{Ft;Lðñ9Ö×’IŸ RSTLJ ‘’‰LAMAF Ft;LðñRávã60wxyz{F t;Lðñ6匕Žs†RC[•LJ
4.
*`b)
! `a[b,M1hijklm9:PI=5>?@1# ABF)TLJUV[‡ˆ‰L;¨fp6œ•bM1h ijklm9Y)t[•I†RC7pTLJ ! AMAF Ft;Luv[b,t;SpTLY69‹‰< 92%R••¨Œ•މL†Rp¢,M1hijklm9 :;<AMAF Ft;I†Rbno[gIR;ÈIJ eô[,AMAF Rw.>•52Rnåño…Lðñb ¦Y’;Y6[53%[gTLJgKžŒÙC·C¢Sp TL“”R‰<,¼½6¶žôCgKž’¨SpTL, ¼½6€CØSpTL,AMAF R6nåño…F^¢9 Ï•sIHICgTL†RS-CdÈ¢_IJcdef
1) –—e˜: =5>?@1#AB™#5øl.$UV 9š›Fœ†‰L~œ•,º»••,Vol.49,pp.686-693, 2008. 2) !žŸ , ˆ¡•¢: =5>?@1#AB9ÚI# 5øl.$©ª,£13 •-./0123452¤.i¥ ¦h0, pp.1-8, 2008. 3) •§¨©, ª“e«, ¬-®¯, &'(): TDMC(°)95±¨—˜™€6|²,£ 13 •-./01 23452¤.i¥¦h0, pp.73-79, 2008.4) B. Brügmann: Monte-Carlo Go. http://ideanest.com/vegos/MonteCarloGo.pdf 5) S. Gelly, D. Silver: Combining online and offline knowledge in uct, In ICML ’07: Proceedings of the 24th international conference on Machine learning, pp 273-280, New york, NY, USA, 2007. ACM.
6) R. Coulom: Computing elo ratings of move patterns in the Game of Go,In Computer Games Workshop, Amsterdam / The Netherlands, 2007.
7) L. Kocsis, C. Szepesvári: Bandit Based Monte-Carlo Planning,Proceedings of the 15th European Conference on Machine Learning (ECML), pp.282-293, 2006.