2.2 Left-corner Parsing
2.2.5 Another variant
We now present another variant of the left-corner PDA appeared in the literature (Resnik, 1992;
Johnson, 1998a). We will see that this algorithm has a different property with respect to the stack depth and the degree of center-embedding than Theorem 2.1. In particular, this difference is relevant to the structures that are recognized as center-embedding for the algorithm, which has not been precisely discussed so far; Schuler et al. (2010) give comparison of two algorithms but from a different perspective.
Figure 2.16 shows the list of possible transitions in this variant. The crucial difference between two PDAs is in the form of initial and final stack symbols. That is, in this PDA the initial stack symbol qinitial isS, while qf inal is an empty symbolε, which are opposite in the PDA that we discussed so far (Section 2.2.3).
Also in this variant, the form of stack symbols is different. Instead ofA/B, which represents a subtree waiting forB, it hasA−B, which means that B is theleft-cornerin a subtree rooted at A, and has been already recognized. In other words,Ais the current goal, which the PDA tries to build, whileBrepresents a finished subtree. This is schematically shown in Figure 2.17(a).
Parsing starts withqinitial=S, which immediately changes toS−AwhereA→a, anda∈Σ is the initial token of the sentence. PREDICTION is similar to the one in our variant: It expands the currently recognized structure, and also predicts the sibling symbol (i.e., D), which becomes a new goal symbol. COMPOSITION looks very different, but has the similar sense of transition.
Step Action Stack Read symbol S
1 SHIFT S−A′ a
2 COMPOSITION B
3 SHIFT B−B′ b
4 PREDICT B−C C′
5 SCAN B−C c
6 COMPOSITION D
7 SCAN ε d
Figure 2.18: Parsing process of the PDA in Figure 2.16 to recover the parse in Figure 2.10(b) given the CFG in Figure 2.15 and an input sentencea b c d. The stack depth keeps one in every step after a shift transition.
In the symbolA/B, A is not limited to S, in which caseA is some right descendant of another nonterminal, as depicted in Figure 2.17(b). The sense of COMPOSITIONin Figure 2.16 is that we finish recognition of the left subtree ofA(i.e., the tree rooted atB) and change the goal symbol to C, the sibling ofB. If we consider this transition in the form of Figure 2.17(b), it looks similar to the one in Figure 2.12; that is, the corresponding transition in our variant isX/A B 7−→ε C. Instead, in the current variant, the root nonterminal of a subtreeX is not kept on the stack, and the goal symbol is moved from top to bottom. This is the reason why the final stack symbolqf inalis empty.
The final goal for the PDA is always the preterminal for the last token of the sentence, which is then finally removed by SCAN.
Example This PDA has slightly different characteristics in terms of stack depth and the degree of center-embedding, which we point out here with some examples. In particular, it regards the parse in Figure 2.10(d) as singly (degree one) center-embedded, while the one in Figure 2.10(b) as not center-embedded. That is, it has just the opposite properties to the PDA that we discussed in Section 2.2.3.
We first see how the situation changes for the CFG that we gave an example in Figure 2.14, which analyzed the parse in Figure 2.10(b). See Figure 2.18. Contrary to our variant, this PDA has the property that its stack depth after someshift transitions increases as the degree of center-embedding increases.7 In this case, these are steps 3, 5, and 7, all of which has a stack with only one element. The main reason why it does not increase the stack depth is in the first COMPOSITION
operation, which changes the stack symbol toB. After that, since the outside structure of B is already processed, the remaining tree looks just like left-branching, which the left-corner PDA including this variant processes without increasing the stack depth.
On the other hand, for the parse in Figure 2.10(d), this PDA increases the stack depth as simu-lated in Figure 2.19. At step 2, the PDA introduces new goal symbolC, which remains on the stack
7 This again contrasts with our variant (Theorem 2.1). This is because in the PDA in 2.16, new stack element is introduced with a reduce transition (i.e., PREDICTION), and center-embedding is detected with the followed SHIFT, which does not decrease the stack depth. In our variant, on the other hand, new stack element is introduced by SHIFT. Center-embedding is detected if this new element remains on the stack after a reduce transition (by PREDICTION).
Step Action Stack Read symbol S
1 SHIFT S−A′ a
2 PREDICTION S−B C
3 SHIFT S−B C−B′ b 4 COMPOSITION S−B C′
5 SCAN S−B c
6 COMPOSITION D
7 SCAN ε d
Figure 2.19: Parsing process of the PDA in Figure 2.16 to recover the parse in Figure 2.10(d). The stack depth after a shift transition increases at step 3.
after the followed SHIFT. This is the pattern of transitions with which this PDA increase its stack depth, and it occurs when processing the zig-zag patterns starting from left edges, not right edges as in our variant.
Discussion We have pointed out that there are two variants of (arc-eager) left-corner PDAs, which suffer from slightly different conditions under which their stack depth increases. From an empirical point of view, the only common property is its asymptotic behavior. That is, both linearly increase the stack depth as the degree of center-embedding increases. The difference is rather subtle, i.e., the condition of beginning center-embedding (left edges or right edges).
Historically, the variant introduced in this section (Figure 2.16) has been thought as the realiza-tion of the left-corner PDA (Resnik, 1992; Johnson, 1998a). However, as we have seen, if we base development of the algorithm on the parsing strategy (Section 2.2.2), our variant can be seen as the correct implementation of it, as only our variant preserves the transparent relationship between the stack depth and the disconnected trees generated during enumeration by the strategy.
Resnik (1992) did not design the algorithm based on the parsing strategy, but from an existing arc-standard left-corner PDA (Rosenkrantz and Lewis, 1970; Johnson-Laird, 1983), which also ac-cepts an empty stack symbol as the final configuration. His main argument is that the arc-eager left-corner PDA can be obtained by introducing a COMPOSITIONoperation, which does not exist in the arc-standard PDA. Interestingly, there is another variant of the arc-standard PDA (Nederhof, 1993), which instead accepts theS symbol8. If we extend this algorithm by introducing COMPO
-SITION, we get very similar algorithm to the one we presented in Section 2.2.3 with the same stack depth property.
Thus, we can conclude that Resnik’s argument is correct in that a left-corner PDA can be arc-eagerby adding composition operations, but depending on which arc-standard PDA we employ as the basis, the resulting arc-eager PDA may have different characteristics in terms of stack depth.
In particular, the initial and final stack configurations are important. If the based arc-standard PDA accepts the empty stack symbol as in Rosenkrantz and Lewis (1970), the corresponding arc-eager
8To be precise, the stack item of Nederhof (1993) is a dotted rule like [S→NP•VP] and parsing finishes with an item of the form [S→α•] with someα.
PDA regards the pattern beginning with right edges as center-embedding. The direction becomes opposite if we start from the PDA that accepts the non-empty stack symbol as in Nederhof (1993).
Our discussion in the following chapters is based on the variant we presented in Section 2.2.3, which is relevant to Nederhof (1993). However, we do not make any claims such that this algorithm is superior to the variant we introduced in this section. Both are correct arc-eager left-corner PDAs, and we argue that the choice is rather arbitrary. This arbitrariness is further discussed next, along with the limitation of both approaches as the psycholinguistic models.
Finally, our variant of the PDA in Section 2.2.3 has been previously presented in Schuler et al. (2010) and van Schijndel and Schuler (2013), though they do not mention the relevance of the algorithm to the parsing strategy. Their main concern is in the psychological plausibility of the parsing model, and they argue that this variant is more plausible due to its inherent bottom-up nature (not starting from the predictedSsymbol). They do not point out the difference of two algorithms in terms of the recognized center-embedded structures as we discussed here.