Solving Amazons subgames using AND/OR Tree-Search

3.8 Other classical improvements

4.1.2 Solving Amazons subgames using AND/OR Tree-Search

4.1 Amazons subgames

AND/OR Tree-Search

AND/OR trees are a speciﬁc kind of trees in which leaf nodes are two-valued, 0-false or 1-true, and internal nodes are either OR nodes or AND nodes. Following the same rules as in traditional logic, an OR node is true (has value 1) if at least one of its children is true, and an AND node is false (has value 0) if at least one of its children is false. An AND/OR tree can be used to represent a usual game-tree. In this case, leaf nodes with the value 1 are usually nodes corresponding to a win, while those with value 0 correspond to a loss. As for internal nodes, AND nodes and OR nodes are alternated in the same fashion as minimax with the root node being an OR node. Draws, if they exist, must be handled either as true nodes in which case we will search for strategies giving at least a draw, or as false nodes in which case we will search for strategies giving only a win.

Several algorithms exist to solve AND/OR game-trees, that is to ﬁnd the value of the root [64]. Among them, the most famous is without a doubt Proof-Number Search. We will present it in the next section as well as two other of its versions, DFPN and WPNS.

On a terminology note, proving a node consists in showing that its value is true (1) while disproving it consists in showing that its value is false (0).

Proof-Number Search

Proof-Number Search (abbreviated as PNS) was developed in 1994 [8] by Alliset al.. The main idea of PNS is to make use of Conspiracy Numbers [57], in the present case proof numbers (pn) and disproof numbers (dn), to search ﬁrst nodes whose value is easier to prove or disprove. The search in PNS is conducted until the value of the root - true or false - is found by iteratively expanding a most proving node,i.e. one of the easiest node to prove or disprove.

The proof-number (resp. disproof number) of a node N is deﬁned as the minimum number of leaf nodes which value have to be determined to be able to prove (resp. dis-prove) N. The following rules are used to deﬁnes these numbers:

• For a terminal leaf node, the proof number (resp. disproof number) is equal to 0 (resp. ∞) if the node is a win for the ﬁrst player, and to∞ (resp. 0) if the node is a loss.

• For a non-terminal leaf node, both proof and disproof numbers are set to 1.

• For an internal node, the proof and disproof numbers pn and dn are given by the Formulas 4.1 and 4.2.

For an OR node: pn(N) = min

c=child(N)pn(c) dn(N) =

c=child(N)

dn(N) (4.1)

For an AND node: pn(N) =

c=child(N)

pn(c) dn(N) = min

c=child(N)dn(N) (4.2) Simply put, for an OR node, we need to prove only one of its child to prove the node, so its proof number should be equal to the minimum proof number of the child nodes. On the other hand, to disproove an OR node it is necessary to disproove all of this child, thus the sum. By replacing disproof numbers by proof numbers and vice-versa, the same can be said of AND nodes.

Depth-First Proof Number Search

DFPN is a varianr of PNS developed in 2002 [62] using the same theory and algorithm.

The ﬁrst motivation of Nagai when he proposed DFPN, and even PDS (Proof-Disproof Search) before that [61], was to combine the best of both worlds: Proof numbers and their derivatives leading to a quick search, and best ﬁrst having fewer memory requirements [64].

It notably adds two threshold numbers for each node during the exploration of the tree to facilitate the search of a most proving node. DFPN is known to have solved some of the longest Shogi problem created (like ”Micro Cosmos” [62]), thus accrediting its performances.

Weak Proof Number Search

WPNS is another variant of PNS recently developed [80] whose performances have been shown slightly better than those of PNS. It has the main advantage of better handling the presence of Directed Acyclic Graphs (DAG) in the explored tree. Due to the presence of a sum in Formula 4.1, potential transposition nodes are counted twice in the calculus of proof numbers for AND nodes, leading to over-estimations of these values. For this reason, in the original algorithm we just replace Formula 4.2 used to calculate proof numbers for an AND node with this one:

pn(N) = max

c=childN ode(N)pn(c) +|N.unsolvedChilds| −1 (4.3) This formula simply replaces the value of the children’s proof number by 1 except for the biggest one. Also, the formula calculating disproof numbers for OR nodes is replaced in the same way.

This formula does not anymore represent exactly the number of nodes that have to be proved in order to prove one node, but all in all only ordering of the move is important, not the correct values of the pn and dn for each node.

The two variations of DFPN and WPNS being independent of each other, it is obvi-ously possible to adapt both these methods to create a Depth-First version of WPNS.

Using AND/OR solving techniques for Amazons subgames

The goal of two-player subgames being multi-valued and not two-valued, traditional and/or search methods such as PNS cannot be used as-is. Instead, we must modify their search process as described in [8]: given a result to prove (e.g. white wins by 5 points), we tag leaf nodes as true if their value is equal or superior to this result, and false otherwise. The search for the best result is then an iterative process: we try ﬁrst to prove the minimum possible result (in Amazons, a loss for the ﬁrst player by a number of points equal to the number of empty squares in the subgame), then increase it until we can disprove the result to try. The value of the subgame is then the latest result that the search was able to prove. This obviously requires us to be able to order and enumerate the diﬀerent possible results, which is fortunately the case for all score based games similar to the game of the Amazons.

Speciﬁcally for the game of the Amazons, several improvements can be made to speed up the search. This consists mostly in tagging true or false internal nodes of the search

tree, either because the result is already proved or because it will be impossible to prove it. Considering a position where we want to prove that a player P can win by N points, points, we propose the following rules:

• After player P passes, the node can be tagged as false.

• If player P has been able to play N moves after a pass of his opponent, the node can be tagged as true.

• If player P is to move while his opponent did not yet pass and there are strictly less thanN empty - or reachable - squares on the board, the node can be tagged as false.

ドキュメント内 Monte-Carlo Techniques: Applications to the Game of the Amazons (ページ 63-66)