• 検索結果がありません。

ブロックスデュオにおけるモンテカルロ木探索

N/A
N/A
Protected

Academic year: 2021

シェア "ブロックスデュオにおけるモンテカルロ木探索"

Copied!
4
0
0

読み込み中.... (全文を見る)

全文

(1)

!"#$%&'()*+,

-./01"234

!"#$%

! ! &'()

†† *+,-./01234526789:;<=5>?@1#ABCDEFGH<;IJKL,M./6NOPQR STLUVWX9YZ"S-./[=5>?@1#ABC\]^_<;IJ`a[bcd-./6ef[gIM1 hijklm9:;<=5>?@1#ABCno[gIpFqrsIJ AMAF Ft;Luv[b,0wxyz{|6}~•€F•‚9ƒ„^…I†R9‡ˆ‰,Š.j0123/9‹‰ < 92%Œ•Žs†RC[•LJKLw.>•52Ft;L0wxyz{6‘’sI†R[,‘’“9‹‰,53%Œ •Žs†RC[•LJ

Monte-Carlo Tree Search

in the Game of Blokus-Duo

Kenta Sasaki

!

Yoshiyuki Kotani

††

! !

Recently, Monte-Carlo Tree Search attracts attention in a field of the game programming. Moreover, Monte-Carlo Tree Search is studied by various games as well as the game of Go that became the instigator of the boom. This paper inspects it in Blokus-Duo whether Monte-Carlo Tree Search is effective.

In the experiment that used AMAF, it succeeded in a great increase of the update frequency after play-out, and it won the base program by 92%. In addition, by the experiment that used a rating for, it won the program before the improvement by 53%.

1.

567)

*+,-./01234526789:;<=5>? @1#ABCDEFGH<;IJ”6•–R‰<,AB R—˜™€Ft;Leš›Sœ•[gKž’;Ÿ F¡ ¢_<;SpTLUV9:;<••S‡ˆF£¤L†R Cg¥¢_I[1]J¦*[bUV§P[S¨,©ª«m¬ 1S-Z"S-./9:;<=5>?@1#ABFt; Lœ•C\]^_<;I[2][3]J `a[bcd-./6ef[gIM1hijklm9 ‹‰<=5>?@1#ABFt;I†R[®¨SIpq rF)¯JM1hijklmbGPCC 6°±R‰<e² +,³+R´•,µ+Y°±R‰<¶ž·¥¢_<;I ¸¹º»-./[gIJ†6-./6¼½R‰<,¾¿ œ€CÀÁ9Â;†RCg¥¢_IJÜ[b¾¿œ€ C414 Äžgž,€œÅ[b¾¿œ€C 700Æ800 ÄžF ÇÈI†RYÉÊ9:†IJ†6L¤,ef6Ëœ9‹ † ÌÍÎϕЕÐÑ ÏÐÒ º»ÏÐÓÔ!

Department of Computer and Information Sciences, Graduate school of Technology, Tokyo University of Agriculture and Technology

††

ÌÍÎϕЕÐÑ ÏÐÒ

Department of Computer and Information Sciences, Tokyo University of Agriculture and Technology

sI0wxyz{€CÕ6-./9ÖרS¨SI†R p¢,oÙÚ¨=5>?@1#ABF)¯†RCef6 ° ± R ‰ < Û ¥ ¢ _ I J † 6 Ü ± F Ý Þ s I L ¤ 9 All-Moves-As-First ßl.àj>•hiáWâ AMAFãR ¼½6w.>•52Ft;<,0123/6wŠ@‘( FäåLJ

2.

-./01"23489:

=5>?@1#AB6‘’R‰<2.1 [b AMAF,2.2 [bBradley-Terry =k@9ÚIw.>•526æç,2.3 [bw.>•526M1hijklmè6ét9f;< êësIJ 2.1 AMAF gI)ìa 6˜íFîïsIðñ,òóSô•[b0 wxyz{6Ü[a Cõö_L¹÷øù.ú6£ûF üý‰<a 6˜íFîïsIJ†_9‹‰,a C0wxy z{6þÿ6!"[õö_L¹÷øù.ú6£ûFüý sIœ•Cgž,#5øl.$UV678[bAMAF R ‰<»%^_<;I[4]J†6œ•b)ì&F'(¾¿S

103

(2)

-þÿ6789)t[•IR^_<:ž,M1hijkl m[b)ìFe*+_,ÈI†RC¾¿[gI†Rp¢

)tC¾¿[gI JKL,-LyxkyR‰< RAVE

(Rapid Action Value Estimate)CgI[5]J

2.2 Bradley-Terry -&1);,<=/>.?8@ A ! ./jS-6012-./9:PIu3îïô•R‰ <Elo w.>•52Ft;Læçô•CgIJ†6æç ô•6•4›569gLIY6[Bradley-Terry =k@C gž,†6=k@Ft;I†R[,gIäñ9™sIŒ 789F)¯†RC[•IJ ! ¼½R‰<,:€16äñF;¯†RC[•,KL, <1§P[S¨../Fd=sI†RY[•IJ>Èö, ../1-2-3,2-4,4-5 C;LR•,2-4 CŒf?Ùb, R@^_IJ†6=k@Ft;<,AËœ6w.>•5 2p¢,ËœB6Œ7F8CsI†RC¾¿RSIJ† 6w.>•526æçô•bR. Coulom 9ÚT<DE^ _L[6]J 2.3 !"#$%&'(B8CD ! w.>•526æç9b,æçsIL¤9FtsIª GCHIRSIJUV«©ª6ðñ,01ªJ6ªGF FtsI†R[’;w.>•526íF¡I†RC[• IC,M1hijklm9b01ªJbKL‰S;J` a[b,†6ܱF•MsIL¤9,UCT y@NàO/ [7]FFt‰L=5>?@10123/9ÚT<ªGFP ‡‰,QRk.$R‰LJGPCC ST6M1hijkl

m•U[ UCT y@NàO/9ÚI0123/C’;‡

V[gTL†Rp¢,wŠ@CWX^_LªG§RdÈ IJ ! Y¤LªG40 ZFFt‰,w.>•526æçF)T LJWâ6Ú¯S¼½FFt‰LJ [ \F]f˜í 21 ^_ [ A`jEFRI˜í 196 < †6ñæ217 <6¼½Fabá1Æ8 œEã,cbá9Æ16 œEã,dbá16 œEWeã6ff9ghTL,ñæ 651 <6¼½9f;<æç‰LJ w.>•526æçŸ 6 e*Fi1,2,3 9jsJ i.1 \6w.>•52 i.2 \kl i.3 ab9:PIm(˜í

3.

EF

3.1 [b AMAF Ft;Luv,3.2 [bw.>•52F t;Luv,3.3 [b AMAF Rw.>•52Fnåño …LuvF)¯J 3.1 AMAF GDHIEF AMAF 6p¿FÖqsIL¤90wxyz{}~•€ 6Öq,‘’“R‘’|[‹ZuvF)TLJ 3.1.1 J<KLMNOPQR8ST ! ur9ZsFtÈ,AMAF FFt‰LY6RuFt6 Y6R[0wxyz{}~•€6ÖqF)TLJvwB xLž60wxyz{}~•€[email protected] 9jsJ @.1 0wxyz{}~•€6Öq ZsA Zs B Zs C uFt 168 298 872

104

(3)

-Ft 436 1606 5769 ! ! Zs-A! Zs-B! ! ! ! ! Zs-C †6Ÿ Úž,œ€Cyz§Zs{-Ëœ6+_,È C¾¿SðñCÂ;†RC7pIJ 3.1.2 UVEF

AMAF 6o F|×IL¤9,3.1.1 6uv!6}~[ AMAF Ft‰LY6RuFt6Y6F‹Z^…LJ eœ10 w,ñæ 100 ‹Z)TLá••7Pb 0.5 ŒãJ Ÿ b,Ft‰LY6C92 ŒR••¨Œ•މLJ†6 †Rp¢M1hijklm9:;<,AMAF bno[g I†RCopIJ 3.2 <=/>.?GDHIEF ! æç‰Lw.>•526p¿—˜,0wxyz{9n å€z§ðñ6p¿—˜F)TLJ 3.2.1 <=/>.?8WXYZ æç‰Lw.>•526íC-6•~6®^[gIp |×IL¤9‚€ËœR‹Z^…<p¿F|×LJw. >•52Ft;LËœ6õƒô•R‰<0^_6ô•[ uvF)TLJ ávã@.wh{„…›Ëœõƒô•[,¹¾¿œN < 9‹‰<,gI¾¿œi Cõƒ^_I?Ùb, R@s†RC[•IJ á†ã35‡/90f6œFõH,w.>•52Cˆ; œFõƒsIJ †60f6Ëœô•Ft;<‚€ËœR 1,000 ‹Z) TLá••7Pb0.5 ŒãJŸ [email protected] 9jsJ @.2 w.>•526p¿Öq ! ! "#$%&'() *+! "#$%, -.. /0 123456! 789&:) $%! .;<=>! ?@;A! 7B9&:) $%! .;A@?! ?<;?! 3.2.2 J<KLMNB8<=/>.?8[D 3.2.1 [j‰L0^_6Ëœõƒô•F0wxyz{! 6Ëœõƒ9ét‰,0wxyz{6‘’F)TLJ0 wxyz{6‘’“R‹Z^…I†R[®^C‰Š‰L pF|×LJ eœ10 w,ñæ 100 ‹Z)TLá••7Pb 0.5 ŒãJ Ÿ [email protected] 9jsJ @.3 0wxyz{‘’9ÚIŒÙ ! CDE&'()*+! 789&:)$%! .;@>! 7B9&:)$%! .;@-! 3.3 AMAF \<=/>.?8]^_`a AMAF 9ÚI0wxyz{6}~R,w.>•52F t;L0wxyz{6‘’6nåño…FdÈIJ nå ño…ô•R‰<Wâ60^_6ô•Fä‰LJ ‹Œ! 3.1 [j‰LòóS AMAF R 3.2 [j‰L‘’‰ L0wxyz{Fnåño…

•Œ! AMAF 9ÚT<0wxyz{Ÿ F}~sIð

ñ,Ž••xO6ø.j6厑9}~F)¯Ú¯9‰

LY6á‘’AMAFãR 3.2 [j‰L‘’‰L0wxy

z{Fnåño…

†_¢6nåño…R,òóSAMAF R‚€0wxy

z{Fnåño…LY6R‹Z^…I†R[®^C‰Š ‰LpF|×LJ

(4)

-eœ10 w,ñæ 100 ‹Z)TLá••7Pb 0.5 ŒãJ Ÿ [email protected] 9jsJ @.4 AMAF Rw.>•526nåño… ! ! 7FG HIHJ!K! "# LM9 &'()*+! FG HIHJ!K! 789&:) LM! .;>@@! FG HIHJ!K! 7B9&:) LM! .;>N! CD HIHJ!K! 789&:) LM! .;@N! CD HIHJ!K! 7B9&:) LM! .;>?! ! òóSAMAF R‘’‰L0wxyz{6nåño…b, ‚€0wxyz{Ft;Lðñ9Ö×’IŸ RSTLJ ‘’‰LAMAF Ft;LðñRávã60wxyz{F t;Lðñ6匕Žs†RC[•LJ

4.

*`b)

! `a[b,M1hijklm9:PI=5>?@1# ABF)TLJUV[‡ˆ‰L;¨fp6œ•bM1h ijklm9Y)t[•I†RC7pTLJ ! AMAF Ft;Luv[b,t;SpTLY69‹‰< 92%R••¨Œ•މL†Rp¢,M1hijklm9 :;<AMAF Ft;I†Rbno[gIR;ÈIJ eô[,AMAF Rw.>•52Rnåño…Lðñb ¦Y’;Y6[53%[gTLJgKžŒÙC·C¢Sp TL“”R‰<,¼½6¶žôCgKž’¨SpTL, ¼½6€CØSpTL,AMAF R6nåño…F^¢9 Ï•sIHICgTL†RS-CdÈ¢_IJ

cdef

1) –—e˜: =5>?@1#AB™#5øl.$UV 9š›Fœ†‰L~œ•,º»••,Vol.49,pp.686-693, 2008. 2) !žŸ , ˆ¡•¢: =5>?@1#AB9ÚI# 5øl.$©ª,£13 •-./0123452¤.i¥ ¦h0, pp.1-8, 2008. 3) •§¨©, ª“e«, ¬-®¯, &'(): TDMC(°)95±¨—˜™€6|²,£ 13 •-./01 23452¤.i¥¦h0, pp.73-79, 2008.

4) B. Brügmann: Monte-Carlo Go. http://ideanest.com/vegos/MonteCarloGo.pdf 5) S. Gelly, D. Silver: Combining online and offline knowledge in uct, In ICML ’07: Proceedings of the 24th international conference on Machine learning, pp 273-280, New york, NY, USA, 2007. ACM.

6) R. Coulom: Computing elo ratings of move patterns in the Game of Go,In Computer Games Workshop, Amsterdam / The Netherlands, 2007.

7) L. Kocsis, C. Szepesvári: Bandit Based Monte-Carlo Planning,Proceedings of the 15th European Conference on Machine Learning (ECML), pp.282-293, 2006.

参照

関連したドキュメント

Ser7 is the value of an American option computed using a 100,000 path Monte Carlo simulation taking 7 terms in series (1.3) as the exercise boundary.. LUBA is the LUBA

Keywords: generalized Fokker – Planck; deterministic method; radiotherapy; particle transport; Boltzmann equation; Monte Carlo.. AMS Subject Classification: 35Q20; 35Q84; 65C05;

In general, SDEs under regime-switching have no explicit solutions so the Monte Carlo simulations have become one of the powerful techniques in valuation of financial quantities,

T´oth, A generalization of Pillai’s arithmetical function involving regular convolutions, Proceedings of the 13th Czech and Slovak International Conference on Number Theory

In Section 3 we study the current time correlations for stationary lattice gases and in Section 4 we report on Monte-Carlo simulations of the TASEP in support of our

Taking care of all above mentioned dates we want to create a discrete model of the evolution in time of the forest.. We denote by x 0 1 , x 0 2 and x 0 3 the initial number of

Toshihiro Shirakawa and Ryuhei Uehara Common Developments of Three Different Orthogonal Boxes, The 24th Canadian Conference on Computational Geometry CCCG 2012, pp... The bible of

Amount of Remuneration, etc. The Company does not pay to Directors who concurrently serve as Executive Officer the remuneration paid to Directors. Therefore, “Number of Persons”