• 検索結果がありません。

タブーサーチを内包したモンテカルロ木探索に基づく囲碁アルゴリズム

N/A
N/A
Protected

Academic year: 2021

シェア "タブーサーチを内包したモンテカルロ木探索に基づく囲碁アルゴリズム"

Copied!
4
0
0

読み込み中.... (全文を見る)

全文

(1)

λϒʔαʔνΛ಺แͨ͠ϞϯςΧϧϩ໦୳ࡧʹجͮ͘

ғޟΞϧΰϦζϜ

ଠా ༤େ

1

ҏ౻ խ

2 ֓ཁɿϞϯςΧϧϩ໦୳ࡧʹ͓͚ΔϓϨΠΞ΢τͷޮ཰Խͷݚڀ͸׆ൃʹߦΘΕ͖ͯͨ. ͔͠͠ɼϓϨΠ Ξ΢τͷଟ༷ੑʹ͍ͭͯͷݚڀ͸͋·Γ͞Ε͍ͯͳ͍. ͦ͜ͰɼຊݚڀͰ͸ϞμϯώϡʔϦεςΟΫεͷ ҰͭͰ͋ΔλϒʔαʔνΛϓϨΠΞ΢τʹద༻͢Δ͜ͱΛఏҊ͢Δ. ϓϨΠΞ΢τΛߦͬͨہ໘Λλϒʔ Ϧετʹ௥Ճ͠ɼλϒʔظؒ୳ࡧ͢ΔͷΛېࢭ͢Δ. ·ͨɼλϒʔظؒΛա͗ͨہ໘ΛλϒʔϦετ͔Β औΓআ͘. ͦΕʹΑΓϓϨΠΞ΢τͷଟ༷ੑΛ֬อ͢Δ͜ͱ͕Ͱ͖Δ. ਺஋࣮ݧΛରہٴͼ٧ޟʹͯߦ͍ɼ λϒʔαʔνΛ಺แͨ͠ϞϯςΧϧϩ໦୳ࡧ͸୯७ͳϞϯςΧϧϩ໦୳ࡧʹൺ΂ͯྑ͍ੑೳ͕ಘΒΕͨ.

An Igo Algorithm of Monte Carlo Tree Search

Including Tabu Search

Takehiro Ohta

1

Masaru Itoh

2

Abstract: Efficiency of playout in Monte Carlo tree search (MCTS) have been extensively studied up to now. However, diversity of playout is not really investigated. Because of that, this paper refers to the diver-sity in MCTS. So this paper proposes to combine MCTS with tabu search (TA), which is a modern heuristic technique for combinatorial problems, into the computer igo algorithm. Once a phase of the playout is added into a tabu list, the searching method prohibits the adoption of the same phase during a given tabu tenure. When the number of trials for playouts is greater than the tabu tenure, the phase is removed from the tabu list. And then the phase could be adopted again. Thus the proposed method can be obtained to ensure the diversity of playout as a whole. The numerical results for some life-and-death igo problems shows that the method of MCTS including TA have obviously got an advantage over the simple MCTS algorithm with the view of the right moves.

1.

͸͡Ίʹ

ۙ೥ͷίϯϐϡʔλғޟͷعྗ͸ɼ͜͜਺೥࣮֬ʹ޲

্͍ͯ͠Δ. 2013೥3݄ʹߦΘΕͨୈ̒ճUECഋίϯ

ϐϡʔλғޟେձ[1]Ͱ͸ɼCrazy Stone[2]͕༏উͨ͠.

ίϯϐϡʔλғޟͷعྗ͕޲্ͨ͠ཧ༝ͱͯ͠ɼϞϯ ςΧϧϩ໦୳ࡧ(MCTS: Monte Carlo Tree Search)[3]΍

ϓϨΠΞ΢τͷޮ཰ԽͳͲ͕ڍ͛ΒΕΔ. ؔ࿈ݚڀͱͯ͠

LGRF(Last Good Reply with Forget)[4]΍ύλʔϯʹΑ

1 Ѫ஌޻ۀେֶେֶӃܦӦ৘ใՊֶݚڀՊ

Graduate School of Business Administration and Computer Science, Aichi Institute of Technology

2 Ѫ஌޻ۀେֶ৘ใՊֶ෦৘ใՊֶՊ

Faculty of Information Science, Aichi Institute of Technology

ΔϓϨΠΞ΢τͷணख[5]ͳͲ͕͋Δ. લऀ͸ϓϨΠΞ΢τ தͷ͋Δہ໘Ͱউͬͨଧͪํ͸هԱ͓͖ͯ͠ɼෛ͚ͨଧͪ ํ͸๨ΕΔͱ͍͏ख๏Ͱ͋Δ. ͜ΕʹΑΓಉ͡ہ໘ʹૺ۰ ͨ͠৔߹ɼউͬͨ࣌ͷखΛଧͭ͜ͱͰϓϨΠΞ΢τΛޮ཰ Խ͢Δ. ޙऀ͸ύλʔϯΛϓϨΠΞ΢τͰͷީิखͱ͢Δ ͜ͱͰϓϨΠΞ΢τͷਫ਼౓Λ޲্ͤ͞Δ. ͜ͷΑ͏ʹϓϨ ΠΞ΢τͷޮ཰Խʹ͍ͭͯ͸׆ൃʹݚڀ͞Ε͍ͯΔ͕ɼϓ ϨΠΞ΢τͷଟ༷ੑʹ͍ͭͯ͸͋·Γݚڀ͞Ε͍ͯͳ͍. ͦ͜ͰɼຊݚڀͰ͸ϞμϯώϡʔϦεςΟΫεͷҰͭͰ ͋Δλϒʔαʔν[6]ΛϓϨΠΞ΢τʹద༻͢Δ͜ͱΛఏ Ҋ͢Δ. λϒʔαʔν͸ɼҰ౓୳ࡧΛߦͬͨղΛλϒʔϦ ετʹ௥Ճ͠ɼλϒʔظؒத͸୳ࡧ͠ͳ͍ͱ͍͏ΞϧΰϦ ζϜͰ͋Δ. ಉҰہ໘ͷ८ճΛ๷͗ͭͭɼ୳ࡧۭؒͷଟ༷

The 18th Game Programming Workshop 2013

(2)

-ੑΛ֬อ͢Δ͜ͱ͕Ͱ͖Δ. ୯७ͳϞϯςΧϧϩ໦୳ࡧͱλϒʔαʔνΛ಺แͨ͠Ϟ ϯςΧϧϩ໦୳ࡧͱͷੑೳൺֱΛରہ͓Αͼ٧ޟʹͯߦͬ ͨ. ·ͨɼϓϨΠΞ΢τʹ͓͚ΔಉҰہ໘Λ୳ࡧͨ͠ൺ཰ ʹ͍ͭͯ΋ߟ࡯͢Δ.

2.

ϞϯςΧϧϩ໦୳ࡧ

ϞϯςΧϧϩ໦୳ࡧ͸ɼϞϯςΧϧϩ๏ʹUCB1Ξϧΰ

ϦζϜ(Upper Confidence Bound)[7]ͱ໦୳ࡧΛ༻͍ͨΞ

ϧΰϦζϜͰ͋Δ. ୯७ͳϞϯςΧϧϩ๏͸ɼ໌Β͔ʹѱ ͍खʹ΋ϓϨΠΞ΢τΛۉ౳ʹ࣮ߦ͠ɼฏۉউ཰Ͱखͷྑ ͠ѱ͠Λ൑அ͢Δ. ͔͠͠ɼUCB1ΞϧΰϦζϜΛ࢖༻͢ Ε͹ɼউ཰͕ߴ͍खʹରͯ͠ϓϨΠΞ΢τΛΑΓଟ͘ࢼߦ Ͱ͖ɼ࠷େ(૬खʹͱͬͯ͸࠷খ)ͷՁ஋Λ࣋ͭख͕બ୒Ͱ ͖ΔΑ͏ʹͳΔ. UCB1ΞϧΰϦζϜͷܭࢉࣜΛࣜ(1)ʹࣔ͢. ͜͜Ͱɼ ¯ Xi͸खiΛଧͬͨ࣌ͷݱࡏͷউ཰ɼni͸खiҎ߱ͷϓϨΠ Ξ΢τͷճ਺ɼN͸ͦͷہ໘ʹ͓͍ͯࢪߦͨ͠ϓϨΠΞ΢ τͷ૯਺(niͷ૯࿨)Ͱ͋Δ. UCB1(i) = ¯Xi+  2 logN ni (1) ϞϯςΧϧϩ໦୳ࡧͷॲཧͷྲྀΕΛҎԼʹࣔ͢. STEP 1 ϧʔτϊʔυ͔Βࣜ(1)ͷ஋͕࠷େͰ͋Δࢠ ϊʔυΛબ୒͠ͳ͕Β຤୺ϊʔυ·Ͱ໦ΛԼΓΔ. STEP 2຤୺ϊʔυͷϓϨΠΞ΢τճ਺͕ᮢ஋Λ௒͑ͯ ͍Ε͹ɼͦͷϊʔυͷࢠϊʔυΛ࡞੒͠໦ΛԼΓΔ. ᮢ஋Λ௒͍͑ͯͳ͚Ε͹ɼࢠϊʔυ͸࡞੒͠ͳ͍. STEP 3຤୺ϊʔυͰϓϨΠΞ΢τΛ1ճߦ͏. STEP 4୧͖ͬͯͨϊʔυʢϧʔτϊʔυΛؚΉʣ͢΂ ͯʹϓϨΠΞ΢τͷ݁ՌΛ൓өͤ͞ɼউ཰Λߋ৽͢Δ. STEP 5੍ݶ࣌ؒ౳ͷ੍໿ʹୡ͍ͯ͠Ε͹୳ࡧΛऴྃ͢ Δ.͞΋ͳ͚Ε͹STEP 1ʹ໭Δ.

3.

ఏҊ๏ͷ֓ཁ

ғޟͰ͸ɼਤ1ͷΑ͏ʹணखखॱ͕ҟͳ͍ͬͯͯ΋ಉ͡ ہ໘ʹͳΔ͜ͱ͕ଟ͍. ಛʹɼ9࿏൫΍٧ޟͰ͸ணखൣғ͕ ڱ͍ͷͰݦஶʹදΕΔ. ಉҰہ໘ͷ८ճʹΑΓɼہॴղʹ ऩଋ͢ΔՄೳੑ͕ੜ͡Δ. ಉҰہ໘ͷ८ճ͕ى͜Γқ͍ہ ໘ͱͯ͠γνϣ΢͕ڍ͛ΒΕΔ. ྫ͑͹ɼγνϣ΢͕ਖ਼ண Ͱ͋Δہ໘Ͱγνϣ΢Λूதతʹ୳ࡧ͢Δͷ͸ਖ਼͍͕͠ɼ γνϣ΢͕ਖ਼ணͰͳ͍ہ໘Ͱγνϣ΢Λूதతʹ୳ࡧ͢Δ ͷ͸ޮՌతͰ͸ͳ͍. ͦ͜ͰɼຊݚڀͰ͸ϞμϯώϡʔϦεςΟΫεͷҰͭͰ ͋ΔλϒʔαʔνΛϞϯςΧϧϩ໦୳ࡧʹద༻͢Δ. 3.1 ఏҊ๏ͷΞϧΰϦζϜ λϒʔαʔν͸ϞϯςΧϧϩ໦୳ࡧͷSTEP 3ͷϓϨΠ Ξ΢τதͷہ໘ʹରͯ͠ߦ͏. Ұ౓୳ࡧͨ͠ϓϨΠΞ΢τ ͷہ໘ΛλϒʔϦετʹ௥Ճ͓͖ͯ͠ɼλϒʔظؒத͸୳ ࡧ͠ͳ͍Α͏ʹ͢Δ. λϒʔαʔνΛ಺แ͢Δ͜ͱͰɼಉ Ұہ໘ͷ८ճΛ๷͗ͭͭɼϓϨΠΞ΢τͷଟ༷ੑ͕֬อͰ ͖Δ. λϒʔϦετ͸ਤ2ʹࣔ͢Α͏ʹϞϯςΧϧϩ໦ͷ ֤຤୺ϊʔυ͕อ࣋͢Δ. ਤதͷϊʔυn1ͱϊʔυn2ͷ λϒʔϦετͷཁૉ͕ҟͳΔΑ͏ʹɼλϒʔϦετ͸֤ ϊʔυຖʹ؅ཧ͠ɼϊʔυؒͰͷλϒʔϦετͷڞ༗͸ߦ Θͳ͍. λϒʔϦετ͸ϓϨΠΞ΢τॳख༻ͱ࣍ख༻ͷ2 ͭͷϦετͰߏ੒͞ΕɼϦετͷߏ଄͸Ωϡʔߏ଄ͱ͠ɼ Ϧετͷ௕͞ΛλϒʔαΠζͱ͢Δ. ਤ2͸λϒʔαΠζ ͕3Ͱ͋ΔϦετΛද͓ͯ͠ΓɼਤதͷFirst͸ॳख༻ͷ ϦετΛɼSecond͸࣍ख༻ͷϦετΛҙຯ͢Δ. ·ͨɼਤ 2தͷ໼ҹ͸λϒʔϦετΛߋ৽͢ΔࡍͷΩϡʔΛͣΒ͢ ํ޲Λද͍ͯ͠Δ. લઅϞϯςΧϧϩ໦୳ࡧSTEP3΁ͷ௥ՃΞϧΰϦζϜ ͷ֓ཁΛҎԼʹࣔ͢. STEP 3-1ϓϨΠΞ΢τ1ख໨ͱ2ख໨ͷ࣌ɼͦͷہ໘ ͕λϒʔϦετʹؚ·Ε͍ͯΔ͔൱͔Λௐ΂Δ. STEP 3-2ؚ·Ε͍ͯͳ͚Ε͹ɼͦͷہ໘ΛλϒʔϦε τʹ௥Ճ͠ɼϓϨΠΞ΢τΛଓߦ͢Δ.ؚ·Ε͍ͯΔ ͳΒ͹ɼͦͷख͸ഁغ͞ΕɼϓϨΠΞ΢τΛ1ख໨͔ Β΍Γ௚͠ɼSTEP 3-1ʹ໭Δ. STEP 3-3λϒʔϦετʹہ໘Λ௥Ճ͢Δͱಉ࣌ʹɼλ ϒʔϦετΛߋ৽͠ɼλϒʔظؒΛա͗ͨہ໘Λλ ϒʔϦετ͔ΒऔΓআ͘. λϒʔαʔνΛಋೖ͢Δ͜ͱͰɼ୯७ͳMCTSख๏ʹ ൺ΂ͯϓϨΠΞ΢τʹཁ͢Δ͕࣌ؒ૿େ͢Δͱਪ࡯͞ΕΔ. ͔͠͠ɼ࣮ࡍʹ͸୯७ͳMCTSख๏ͱେࠩͷͳ͍ܭࢉ࣌ ؒͰ͋ͬͨ. ࣍અͰఏҊ๏ͷܭࢉྔʹ͍ͭͯઆ໌͢Δ. 1 3 2 1 3 2

1 Example of the same state obtained from different moves

First Second Tabu Size = 3 (5, 3) (4, 5) (7, 1) … n1 n2 First Second Tabu Size = 3 (4, 7) (3, 2) (8, 4) Null Null Null n1 n2 (2, 1) (1, 8) (3, 4)

2 Tabu lists on each node in the tree

The 18th Game Programming Workshop 2013

(3)

-(5, 3) (7, 1) (1, 8) (3, 4) Null (2, 1) (4, 5) (2, 1) Tabu Size = 3 (5, 3) (7, 1) (1, 8) (3, 4) First Second

3 Updated queue of tabu lists

3.2 ఏҊ๏ͷܭࢉྔ λϒʔϦετͷαΠζΛLͱ͠ɼN࿏൫ͰୈKख໨ͷ ہ໘Λߟ͑Δ. ͜ͷͱ͖ϓϨΠΞ΢τ͕1ճ੒ޭ͢Δ·Ͱ ʹ࣮ߦ͢ΔϓϨΠΞ΢τͷฏۉࢼߦճ਺ΛT ͱ͠ɼT ͷ্ ք஋T¯ΛٻΊͯΈΔ. ҎԼͷ3ͭΛԾఆ͢Δ. (1)λϒʔ Ϧετ͸શ෦ຒ·͍ͬͯΔ. (2)λϒʔϦετʹొ࿥͢Δ ہ໘͸ॳखͱ࣍खͷ2ख·Ͱͱ͢Δ. (3)બ୒͞Εͨख͕ λϒʔϦετʹؚ·ΕΔ৔߹ɼͦͷख͸ഁغ͞Εɼλϒʔ ϦετͷΩϡʔ΋ಉ࣌ʹߋ৽͞ΕΔ. Ծఆ(3)ͷঢ়گΛਤ 3ͷࠨਤʹࣔ͢. ϓϨΠΞ΢τॳखͷ࠲ඪ͕(5, 3)ͳΒ͹ɼ ॳखͷϦετʹ(5, 3)ؚ͕·Ε͍ͯΔͷͰɼख(5, 3)͸ഁ غ͞ΕΔ. ͦͷࡍɼॳखͷϦετ͚͕ͩ1ͭͣΕͯɼਤ3 ͷӈਤͷΑ͏ʹͳΔ. ͯ͞ɼ൫໘ʹ͸ੴ͕(K − 1)ݸ͋ΔͷͰɼୈKख໨͕ λϒʔʹͳΔ֬཰͸L/(N2− (K − 1))Ͱ͋Γɼ࣍ख͕λ ϒʔʹͳΔ֬཰͸L/(N2− K)Ͱ͋Δ. ฏۉࢼߦճ਺Λ্ ք஋ͰධՁ͢Ε͹ɼ֬཰P ΛP = L/(N2− K)ͱ͓͍ͯ Σ(ࣦഊ͢Δճ਺)× (ͦͷ֬཰) + 1Λܭࢉ͢Ε͹Α͍.  n=1 n(n − 1)(1 − P )Pn−1+Pn+ 1 (2) ࣜ(2)ୈ1߲͸λϒʔαʔνΛಋೖͨ͜͠ͱʹΑΓແޮ ͱͳΔϓϨΠΞ΢τͷฏۉࢼߦճ਺ͷ্ք஋Ͱ͋Δ. ϓϨ ΠΞ΢τ͕༗ޮʹͳΔʹ͸͞Βʹ΋͏1ճࢼߦΛ௥Ճ͠ͳ ͚Ε͹ͳΒͳ͍. ͦΕ͕ࣜ(2)ୈ2߲ͷ+ 1ͷҙຯͰ͋Δ. ࣜ(2)Λܭࢉ͢Δͱɼ3P/(1 − P )2+ 1ΛಘΔ. ্ք஋ͱ͍ ͏ҙຯ߹͍͔Βɼ͜ΕΛ4P/(1 − P )2+ 1ͰධՁ͢Ε͹ɼ ݁ہT¯͸ ¯ T = N2− K + L N2− K − L 2 (3) ͱͯ͠ಘΒΕΔ. ࣜ(3)ΑΓϞϯςΧϧϩ໦୳ࡧʹλϒʔαʔνΛ಺แ͞ ͤͯ΋ແޮʹͳΔϓϨΠΞ΢τ਺͕ۃ୺ʹ૿͑Δ͜ͱ͸ͳ ͍ͱ͍͑Δ.

4.

਺஋࣮ݧ

୯७ͳMCTSख๏ͱλϒʔαʔνΛ಺แͨ͠Tabu + MCTSख๏Λରہٴͼ٧ޟʹͯੑೳൺֱΛߦͬͨ. ·ͨɼ ಉҰہ໘ͷ୳ࡧൺ཰ʹ͍ͭͯ΋࣮ݧΛߦͬͨ. ֤छύϥ ϝʔλ͸ɼλϒʔϦετͷαΠζΛ3ʙ12ɼλϒʔظؒΛ 3ʙ12ɼλϒʔϦετʹ௥Ճ͢Δہ໘ΛϓϨΠΞ΢τ2ख

1 Results of the games

݁Ռ ରہ਺ উͪ਺ ෛ͚਺ উ཰ Tabu3 200ہ 138উ 62ഊ 69.0 % Tabu4 200ہ 148উ 52ഊ 74.0 % Tabu5 200ہ 154উ 46ഊ 77.0 % Tabu6 200ہ 162উ 38ഊ 81.0 % Tabu7 200ہ 152উ 48ഊ 76.0 % Tabu8 200ہ 162উ 38ഊ 81.0 % Tabu9 200ہ 155উ 45ഊ 77.5 % Tabu10 200ہ 155উ 45ഊ 77.5 % Tabu11 200ہ 157উ 43ഊ 78.5 % Tabu12 200ہ 160উ 40ഊ 80.0 %

2 Results of binomial test

݁Ռ p஋ ৴པ۠ؒ Tabu3 8.027e−8 0.621 - 0.753 Tabu4 7.261e−12 0.673 - 0.799 Tabu5 1.017e−9 0.703 - 0.865 Tabu6 < 2.200e−16 0.749 - 0.862 Tabu7 8.738e−14 0.695 - 0.817 Tabu8 < 2.200e−16 0.749 - 0.862 Tabu9 2.397e−15 0.711 - 0.831 Tabu10 2.397e−15 0.711 - 0.831 Tabu11 < 2.200e−16 0.722 - 0.840 Tabu12 < 2.200e−16 0.738 - 0.853 Ҏ಺ɼϓϨΠΞ΢τͷᮢ஋Λ100ͱͨ͠. 4.1 ରہʹΑΔੑೳൺֱ ରہ݁ՌΛද1ʹࣔ͢. ද1ͷTabu3ʙ12͸λϒʔαΠ ζ͕3ʙ12Ͱ͋ΔTabu + MCTSख๏Λද͍ͯ͠Δ. Tabu + MCTSख๏Λࠇ൪ɼ୯७ͳMCTSख๏Λന൪ͱ͠ɼί ϛΛ6໨൒ɼ࣋ͪ࣌ؒΛ֤10෼ɼޟ൫Λ9࿏൫ͱͨ͠. ද1ΑΓɼTabu3ʙTabu12ͷશͯʹ͓͍ͯ୯७ͳMCTS ख๏ʹରͯ͠ߴ͍উ཰͕ಘΒΕͨ. ࣍ʹɼද1ͷରઓ݁Ռ Λجʹ༗ҙਫ४5%Ͱೋ߲ݕఆΛߦͬͨ. ݁ՌΛද2ʹࣔ ͢. ͜͜ͰɼؼແԾઆΛʮ୯७ͳMCTSख๏ͱTabu3ʙ Tabu12ͷTabu + MCTSख๏ʹعྗͷࠩ͸ͳ͍ʯͱ͠ɼ ରཱԾઆΛʮೋͭͷख๏ʹعྗͷ͕ࠩ͋Δʯͱͨ͠. ද2 ΑΓTabu3ʙTabu12ͷશͯʹ͓͍ͯp஋͕0.05ΑΓখ͞ ͍ͷ͸໌Β͔Ͱ͋Δ. ΑͬͯɼؼແԾઆ͸غ٫͞Εɼରཱ Ծઆ͕࠾୒͞ΕΔ. ͭ·ΓɼTabu + MCTSख๏ͱ୯७ͳ MCTSख๏ʹ͸عྗͷ͕ࠩ͋Δͱ͍͑Δ. 4.2 ಉҰہ໘ͷ୳ࡧൺ཰ ୯७ͳMCTSख๏ͱTabu + MCTSख๏͕ɼͦΕͧΕ 9࿏൫ͷࠇ൪1ख໨Λ୳ࡧͨ͠ࡍͷϓϨΠΞ΢τʹ͓͍ͯ ಉҰہ໘Λ୳ࡧͨ͠ൺ཰Λਤ4ʹࣔ͢. ·ͨہ໘ͷର৅͸ɼ ϓϨΠΞ΢τ1ख໨ͱ2ख໨ͷہ໘ͱͨ͠. ਤ4ΑΓɼTabu + MCTSख๏ͷํ͕୯७ͳMCTSख ๏ʹൺ΂ͯಉҰہ໘Λ୳ࡧͨ͠ൺ཰͕௿͍͜ͱ͕෼͔Δ.

The 18th Game Programming Workshop 2013

(4)

-10 15 20 25

MCTS Tabu4 Tabu6 Tabu8 Tabu10 Tabu12

S ear ch ing r at ios on t h e s am e s tat e ( % )

first move in playout second move in playout

4 Searching ratios on the same state in playouts

·ͨɼλϒʔαΠζ͕େ͖͘ͳΔʹैͬͯɼಉҰہ໘Λ୳ ࡧͨ͠ൺ཰͕ݮগ͍ͯ͠Δ. ۩ମతʹɼϓϨΠΞ΢τ1ख ໨ͰMCTS͸21.5%ʹର͠Tabu11Ͱ14.6%ɼϓϨΠΞ΢ τ2ख໨ͰMCTS͸17.2%ʹର͠Tabu11Ͱ12.4%ʹݮ গ͍ͯ͠Δ. ͜Ε͸ɼఏҊ͢ΔTabu + MCTSख๏͕୯७ ͳMCTSख๏ʹൺ΂ͯϓϨΠΞ΢τͷଟ༷ੑΛ֬อͯ͠ ͍Δ͜ͱΛҙຯ͢Δ. 4.3 ٧ޟʹΑΔੑೳൺֱ ٧ޟ10໰Λ୯७ͳMCTSख๏ͱTabu + MCTSख๏ ʹͦΕͧΕ30ճͣͭղ౴ͤͨ͞. ͦΕͧΕͷख๏ͷਖ਼౴ ཰Λਤ5ʹࣔ͢. ͜͜Ͱɼਖ਼౴཰Λ٧ޟͷ1ख໨͕ਖ਼ணͰ ׂ͋ͬͨ߹ͱ͢Δ. ਤ5ͷMCTSͱ͸୯७ͳMCTSख๏ ΛɼTabu12͸λϒʔαΠζ12ͷTabu + MCTSख๏Λ ද͍ͯ͠Δ. ਤ5ΑΓMCTSʹൺ΂ͯTabu12ͷํ͕٧ޟਖ਼౴཰͕ ߴ͍͜ͱ͕෼͔Δ. ಉ༷ʹTabu3ʙ11ʹରͯ͠ಉ࣮͡ݧΛ ߦͬͨ݁ՌɼTabu12ʹ͍ۙਖ਼౴཰͕ಘΒΕͨ. ࣍ʹɼ୯७ ͳMCTSख๏ͱTabu3ʙ12ͷTabu + MCTSख๏ͷ٧ޟ ਖ਼౴཰ͷ݁ՌΛجʹ༗ҙਫ४5%Ͱχ2ݕఆΛߦͬͨ. ͜͜ ͰɼؼແԾઆΛʮMCTSͱTabu3ʙ12ʹ༗ҙͳࠩ͸ͳ͍ʯ ͱ͠ɼରཱԾઆΛʮMCTSͱTabu3ʙ12ʹ༗ҙͳ͕ࠩ͋ Δʯͱͨ͠. χ2ݕఆͷ݁ՌΛද3ʹࣔ͢. ද3ΑΓ༗ҙਫ ४5%ͷχ2஋Ͱ͋Δ3.841Λେ෯ʹ௒͍͑ͯΔ. Αͬͯɼ ؼແԾઆ͸غ٫͞ΕɼରཱԾઆ͕࠾୒͞ΕΔ. ͭ·Γɼ୯ ७ͳMCTSख๏ͱTabu3ʙ12ͷTabu + MCTSख๏ʹ͸ ༗ҙͳ͕ࠩ͋Δͱ͍͑Δ.

5.

͓ΘΓʹ

ຊߘͰ͸λϒʔαʔνΛ಺แͨ͠MCTSख๏ΛఏҊ͠ ͨ. ਺஋࣮ݧʹΑΓɼରہٴͼ٧ޟʹ͓͍ͯఏҊͨ͠Tabu + MCTSख๏ͷํ͕୯७ͳMCTSख๏ʹൺ΂ͯੑೳ͕ྑ ͍͜ͱΛ֬ೝͨ͠. ·ͨɼTabu + MCTSख๏͸ϓϨΠΞ ΢τͷଟ༷ੑΛ֬อͰ͖Δ͜ͱ΋֬ೝͨ͠. ࠓޙͷ՝୊ͱͯ͠ɼࣗݾରઓʹΑΔউ཰ͷ෼ੳ͚ͩͰͳ ၥ㢟1 ၥ㢟2 ၥ㢟3 ၥ㢟4 ၥ㢟5 ၥ㢟6 ၥ㢟7 ၥ㢟8 ၥ㢟9 ၥ㢟10 0 20 40 60 80 100 ၥ㢟1 ၥ㢟2 ၥ㢟3 ၥ㢟4 ၥ㢟5 ၥ㢟6 ၥ㢟7 ၥ㢟8 ၥ㢟9 ၥ㢟10 0 20 40 60 80 100 MCTSTabu12

5 Ratios of the correct answers3 Results of chi-squared test

݁Ռ χ2 p Tabu3 98.783 < 2.200e−16 Tabu4 100.109 < 2.200e−16 Tabu5 90.009 1.622e−15 Tabu6 99.114 < 2.200e−16 Tabu7 95.081 < 2.200e−16 Tabu8 107.867 < 2.200e−16 Tabu9 103.609 < 2.200e−16 Tabu10 83.569 3.149e−14 Tabu11 98.841 < 2.200e−16 Tabu12 96.830 < 2.200e−16 ͘ΦʔϓϯιʔεͱͷରઓʹΑΔੑೳൺֱ͕ඞཁͰ͋Δ. ঘɼຊݚڀͷҰ෦͸ɼจ෦Պֶল Պݚඅ ج൫ݚڀ(C) No.25330441 ͷॿ੒Λಘͨ. ͜͜ʹँҙΛද͢Δ. ࢀߟจݙ [1] ୈ6ճUECഋίϯϐϡʔλғޟେձ, http://jsb.cs.uec. ac.jp/˜igo/

[2] Crazy Stone, http://remi.coulom.free.fr/CrazyStone/ [3] R´emi Coulom, “Computing Elo Ratings of Move

Pat-terns in the Game of Go”, Computer Games Workshop, 2007.

[4] Hendrik Baier, and Peter D. Drake, “The Power of Forgetting: Improving the Last-Good-Reply Policy in Monte Carlo Go”, IEEE Transaction on Computational Intelligence and AI in Games, Vol.2, No.4, pp.303–309, 2010.

[5] Slyvain Gelly, Yizao Wang, R´emi Munosɼand Olivier Teytaudɼ“Modification of UCT with Patterns in Monte-Carlo Go”, Technical Report RR-6062, INRIA, 2006. [6] ໺ʑ෦޺࢘,༄Ӝກݑ, “ہॴ୳ࡧ๏ͱͦͷ֦ு—λϒʔ

୳ࡧ๏Λத৺ͱͯ͠”,ܭଌͱ੍ޚ, Vol.47, No.6, pp.493– 499, 2008.

[7] Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer, “Finite-time Analysis of the Multiarmed Bandit Prob-lem”, Machine Learning, Vol.47, No.2–3, pp.235–256, 2002.

The 18th Game Programming Workshop 2013

参照

関連したドキュメント

She reviews the status of a number of interrelated problems on diameters of graphs, including: (i) degree/diameter problem, (ii) order/degree problem, (iii) given n, D, D 0 ,

It is suggested by our method that most of the quadratic algebras for all St¨ ackel equivalence classes of 3D second order quantum superintegrable systems on conformally flat

Reynolds, “Sharp conditions for boundedness in linear discrete Volterra equations,” Journal of Difference Equations and Applications, vol.. Kolmanovskii, “Asymptotic properties of

Kilbas; Conditions of the existence of a classical solution of a Cauchy type problem for the diffusion equation with the Riemann-Liouville partial derivative, Differential Equations,

It turns out that the symbol which is defined in a probabilistic way coincides with the analytic (in the sense of pseudo-differential operators) symbol for the class of Feller

We give a Dehn–Nielsen type theorem for the homology cobordism group of homol- ogy cylinders by considering its action on the acyclic closure, which was defined by Levine in [12]

Applying the representation theory of the supergroupGL(m | n) and the supergroup analogue of Schur-Weyl Duality it becomes straightforward to calculate the combinatorial effect

To derive a weak formulation of (1.1)–(1.8), we first assume that the functions v, p, θ and c are a classical solution of our problem. 33]) and substitute the Neumann boundary