

5.1.4 Analysis using FFNN with a Sliding Window

results in a zig-zag-shaped handwriting (Fig. 5.8). Plots for other writers are available in appendix A.3.

Residuals for each writer were also computed. Fig. 5.7 plots the residuals for the FFNN setup for writer wr = 1. The residual plots for both the FFNN and the MLRA setups exhibit similar patterns: segments in which the residuals change quickly follow segments in which they change substantially more slowly. This observation indicates that models possessing a short-term memory might be appropriate for modeling the data.
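The error measures reported in the tables below can be computed from the teacher and model signals as in the following minimal sketch. It assumes the residuals are simply the difference between the two signals and that the correlation coefficient C is a Pearson-style coefficient expressed in percent; the array names are hypothetical.

```python
import numpy as np

def evaluate_component(teacher, model):
    """Compare one naturalness component (x or y) of the teacher signal with
    the model estimate; both are 1-D arrays of the same length (here 1680)."""
    residuals = teacher - model                               # residual signal as in Fig. 5.7
    mse = float(np.mean(residuals ** 2))                      # mean square error
    corr = float(np.corrcoef(teacher, model)[0, 1]) * 100.0   # assumed Pearson correlation, in percent
    return residuals, mse, corr

# Hypothetical usage with arrays holding the x component for one writer:
# residuals_x, mse_x, c_x = evaluate_component(teacher_x, model_x)
```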

Table 5.3: FFNN: Mean square errors and correlation coefficients for the x and y components of naturalness.

train
wr    mse_x        mse_y        C_x       C_y
#1    3.02×10⁻²    2.01×10⁻²    13.71%    17.94%
#2    2.50×10⁻²    1.63×10⁻²    28.35%    13.78%
#3    5.40×10⁻²    3.60×10⁻²    22.43%    18.68%
#4    3.31×10⁻²    2.41×10⁻²    24.94%    24.64%
#5    2.81×10⁻²    2.58×10⁻²    20.98%    14.59%

Figure 5.6: FFNN: Teacher signals vs. signals estimated by the FFNN model. (Panels: Naturalness, x and y components; horizontal axis: update steps, n = 1, ..., 1680; legend: teacher vs. model.)

the last several points of the path from where a font-stroke is coming have a substantial influence on the naturalness and thus also on where the hand-stroke is going to continue (assumption I). We propose that the way a person writes the first half of a handwritten stroke also influences, to a certain extent, how the second part is going to look; i.e., a distortion that appears in one part of a stroke will usually imply a distortion in a successive part of the stroke. It is therefore reasonable to assume that a short sequence of points in a handwritten stroke will influence where a subsequent point will appear (assumption II).

To confirm the above assumptions, an analysis using an FFNN with a sliding window was performed. Three setups were tested in total.

Figure 5.7: FFNN: Residuals for the x (upper plot) and y (lower plot) components of naturalness for writer wr = 1. (Horizontal axis: update steps, n = 1, ..., 1680.)

Figure 5.8: FFNN: Test trial for writer wr = 1.

A sliding window applied to the inputs was tested first. This setup is referred to as FFNNHU. The input matrices Uij and the corresponding naturalness matrices Vij were created as in the MLRA setup. Instead of teacher-forcing a single input vector Uij(k) to a single output vector Vij(k), a history of the input vectors

HUij(k) = [Uij(k), Uij(k−1), Uij(k−2), ..., Uij(k−lU)]    (5.2)

was teacher-forced to a single output vector Vij(k). At the beginning of a stroke, where previous values are not available, zeros were used instead.

Merging all history vectors HUij(k) together results in the input matrix HU with 2 × (lU + 1) columns. The output matrices V for each writer wr = 1, 2, ..., 5 are identical to those of the MLRA setup.

After several trials, the length lU of the sliding window was set to a value of 15, which in turn increased the number of inputs to 32 and made the input matrix HU of size 1680×32. The number of inputs is substantially higher than in the previous setups, and thus the number of nodes in the first hidden layer was increased to 45 and the number of nodes in the second hidden layer to 35. The number of epochs was also increased to 2000 [4]. All other parameters were identical to the FFNN setup.
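A minimal sketch of how the history matrix HU of equation 5.2 can be assembled is given below. It assumes the per-stroke input matrix Uij is available as a NumPy array with one 2-dimensional input vector per row, and it zero-pads where no previous values exist, as described above; the function name is hypothetical.

```python
import numpy as np

def build_input_history(U, l_U=15):
    """Build the sliding-window input history matrix HU for one stroke.

    U   : array of shape (K, 2), one input vector Uij(k) per row
    l_U : window length; each history vector stacks the current input and
          the l_U previous ones, giving 2 * (l_U + 1) columns
    """
    K, dim = U.shape
    HU = np.zeros((K, dim * (l_U + 1)))
    for k in range(K):
        for lag in range(l_U + 1):
            if k - lag >= 0:                     # zeros remain where no history exists
                HU[k, lag * dim:(lag + 1) * dim] = U[k - lag]
    return HU

# Stacking the per-stroke matrices of all strokes row-wise would yield the
# 1680 x 32 training matrix HU described above (l_U = 15).
```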

Table 5.4 shows the results for each writer. As can be seen, the mean square errors decreased only slightly, but the correlation coefficients increased substantially. This result indicates that the history of the input vectors plays an important role in modeling naturalness and is in line with assumption I, namely that the last several points of the path, indicating from where a font-stroke is coming, have a substantial influence on the naturalness.

Fig. 5.9 plots the teacher signals vs. the signals estimated by the FFNNHU model for writer wr = 1. Fig. 5.10 plots the residuals. Fig. 5.11 visualizes the performance and shows that although introducing the history of inputs increased the performance numerically, it caused no increase in visual performance. Plots for other writers can be found in appendix A.4.

[4] The same configuration tested with the FFNN setup, without a sliding window, produced results similar to those of Table 5.3.

Table 5.4: FFNNHU: Mean square errors and correlation coefficients for the x and y components of naturalness.

train
wr    mse_x        mse_y        C_x       C_y
#1    1.69×10⁻²    1.35×10⁻²    65.98%    58.10%
#2    1.71×10⁻²    1.31×10⁻²    60.46%    45.09%
#3    2.72×10⁻²    2.37×10⁻²    71.67%    59.62%
#4    1.99×10⁻²    1.62×10⁻²    64.91%    60.32%
#5    1.81×10⁻²    1.67×10⁻²    61.22%    58.29%


Figure 5.9: FFNNHU: Teacher signals vs. signals estimated by the FFNNHU model for writer wr = 1. (Panels: Naturalness, x and y components; horizontal axis: update steps, n = 1, ..., 1680; legend: teacher vs. model.)

Figure 5.10: FFNNHU: Residuals for the x (upper plot) and y (lower plot) components of naturalness for writer wr = 1. (Horizontal axis: update steps, n = 1, ..., 1680.)

Figure 5.11: FFNNHU: Test trial for writer wr = 1.

A sliding window applied to the history of outputs was tested next; this setup is referred to as FFNNHV. The history of the output vectors

HVij(k) = [Vij(k−1), Vij(k−2), ..., Vij(k−lV)]    (5.3)

was teacher-forced to a single output vector Vij(k). At the start of a stroke, where previous values are not available, zeros were used instead. Merging all history vectors HVij(k) together results in the input matrix HV with 2 × lV columns. Here the input matrix HV is unique for every writer because it is made from the history of the outputs, which are the writer's unique naturalness vectors. The output matrices V for each writer wr = 1, 2, ..., 5 are again the same as in the MLRA setup.

After several trials, the length lV of the sliding window was set to a value of 10, which in turn increased the number of inputs to 20 and made the input matrices HV of size 1680×20. The network setup is identical to that of FFNNHU. This time, however, the network uses a history of its own outputs as inputs (a nonlinear autoregressive model, NARX), so that the output is fed back as shown in Fig. 5.12 (left). The true output is available during training, and thus a series-parallel architecture can be created (Fig. 5.12, right), in which the true output is used instead of the estimated one. After the series-parallel training, a test in which the estimated output is fed back in a parallel architecture has to be carried out to examine the stability and true performance of the model.
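A minimal sketch of the series-parallel history construction of equation 5.3 is given below, under the same assumptions as the earlier sketch: the per-stroke naturalness matrix Vij is a NumPy array with one 2-dimensional output vector per row, and only the true previous outputs (never the network's own estimates) are stacked into the history; the function name is hypothetical.

```python
import numpy as np

def build_output_history(V, l_V=10):
    """Series-parallel history matrix HV for one stroke: the target at step k
    is V[k], while the input stacks the l_V *true* previous outputs
    (zero-padded at the start of the stroke), giving 2 * l_V columns."""
    K, dim = V.shape
    HV = np.zeros((K, dim * l_V))
    for k in range(K):
        for lag in range(1, l_V + 1):            # history starts at k-1, not k
            if k - lag >= 0:
                HV[k, (lag - 1) * dim:lag * dim] = V[k - lag]
    return HV

# Training pairs for the FFNNHV setup are (HV[k], V[k]); with l_V = 10 the
# stacked matrix over all strokes has 2 * 10 = 20 columns (1680 x 20).
```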

The top part of Table 5.5 shows the results for the training in series-parallel mode. As can be seen, the network was capable of generating the next output from its history of previous outputs with high precision. Compared to all of the previous results, the msex,y decreased by an order of magnitude and the correlation coefficients Cx,y increased substantially. This result indicates that the history of the previous outputs has substantial relevance to the output generated in the next step and is in line with assumption II.

Figure 5.12: Parallel and series-parallel architecture, where SW = sliding window and NN = neural network. Exogenous input activation U(n) is optional.

After training, the network was tested in parallel mode to see if it was capable of creating correct outputs on its own. The trained network was started using a zero input history vector of size 1 × 20. Even with a zero input vector, a non-zero output was produced, and this was in turn used as an input. While the network demonstrated during training that it was capable of settling into the desired dynamics, the bottom part of Table 5.5 shows the results from testing: the performance of the network decreased sharply, and the trained network, when run on its own, could not settle into the desired dynamics.
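The parallel-mode test can be sketched as the closed loop below. Here `predict` stands for a hypothetical forward pass through the trained network that returns one 2-dimensional naturalness estimate, and the history buffer is initialized with zeros as described above; this is an illustrative sketch, not the author's implementation.

```python
import numpy as np

def run_parallel_test(predict, n_steps=1680, l_V=10, dim=2):
    """Closed-loop (parallel) test: start from an all-zero history vector of
    size 1 x (dim * l_V) and feed each estimated output back as new history."""
    history = np.zeros(dim * l_V)                # zero input history vector
    outputs = []
    for _ in range(n_steps):
        v_hat = np.asarray(predict(history))     # estimated output, shape (dim,)
        outputs.append(v_hat)
        # shift the buffer: the newest estimate occupies the first dim entries
        history = np.concatenate([v_hat, history[:-dim]])
    return np.vstack(outputs)

# Example with a placeholder network that always returns zeros:
# generated = run_parallel_test(lambda h: np.zeros(2))
```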

Fig. 5.13 plots the teacher signals vs. the signals estimated by the FFNNHV model in the testing trial for writer wr = 1. Fig. 5.14 plots the residuals. Naturalness signals generated by the FFNNHV model in the testing trial were found to be similar to those of the MLRA model; i.e., the naturalness signals take on values close to zero, resulting in generated handwriting (Fig. 5.15) that is almost identical to the original font shapes (Fig. 5.1). Plots for other writers are available in appendix A.5.

This poor performance may reflect an inability of the FFNN to model attractor-like behavior, in which a network excited by an incorrect input will eventually, after a long enough time, settle into the desired dynamics. Because previous internal states have no influence in FFNN models, such attractor-like behavior is difficult to achieve.

Table 5.5: FFNNHV: Mean square errors and correlation coefficients for the x and y components of naturalness.

train
wr    mse_x        mse_y        C_x       C_y
#1    6.88×10⁻³    3.45×10⁻³    87.68%    91.16%
#2    3.49×10⁻³    2.47×10⁻³    93.32%    91.91%
#3    7.24×10⁻³    4.04×10⁻³    93.30%    94.35%
#4    4.82×10⁻³    3.07×10⁻³    92.73%    93.74%
#5    4.92×10⁻³    3.02×10⁻³    91.10%    93.86%

test
wr    mse_x        mse_y        C_x       C_y
#1    3.39×10⁻²    2.09×10⁻²    11.27%    10.72%
#2    3.14×10⁻²    2.07×10⁻²    14.63%     5.74%
#3    5.89×10⁻²    4.17×10⁻²     0.32%     0.55%
#4    4.05×10⁻²    2.55×10⁻²     3.00%     3.87%
#5    2.99×10⁻²    2.46×10⁻²     9.22%    18.89%

In the final FFNN test, a sliding window was applied to both the inputs and the outputs; this setup is referred to as FFNNHUV. The history of the input and output vectors

HUVij(k) = [HUij(k), HVij(k)]    (5.4)

was teacher-forced to a single output vector Vij(k). As before, at the start of a stroke, where previous values are not available, zeros were used instead.

Merging all history vectors HUVij(k) together results in the input matrix HUV with 2 × (lU + 1) + 2 × lV columns. The output matrices V for each writer wr = 1, 2, ..., 5 were again unchanged from the MLRA setup. The values of lU and lV were left unchanged (lU = 15, lV = 10), and thus the total number of inputs was 32 + 20 = 52, resulting in an input matrix HUV of size 1680×52. The network setup is identical to that of FFNNHU. The training was carried out in series-parallel mode and the testing in parallel mode.
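For the combined setup, a self-contained sketch of the history construction of equation 5.4 is shown below, again assuming per-stroke NumPy arrays U and V with one 2-dimensional vector per row and zero padding where no history exists; the function name is hypothetical.

```python
import numpy as np

def build_combined_history(U, V, l_U=15, l_V=10):
    """Combined history matrix HUV for one stroke (equation 5.4): the current
    and l_U previous inputs plus the l_V previous outputs, zero-padded where
    no history exists; 2 * (l_U + 1) + 2 * l_V = 52 columns in total."""
    K, dim = U.shape
    HUV = np.zeros((K, dim * (l_U + 1) + dim * l_V))
    for k in range(K):
        for lag in range(l_U + 1):               # input history, lags 0..l_U
            if k - lag >= 0:
                HUV[k, lag * dim:(lag + 1) * dim] = U[k - lag]
        off = dim * (l_U + 1)
        for lag in range(1, l_V + 1):            # output history, lags 1..l_V
            if k - lag >= 0:
                HUV[k, off + (lag - 1) * dim:off + lag * dim] = V[k - lag]
    return HUV
```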

Figure 5.13: FFNNHV: Teacher signals vs. model estimated signals for writer wr = 1. (Panels: Naturalness, x and y components; horizontal axis: update steps, n = 1, ..., 1680; legend: teacher vs. model.)

Table 5.6 shows the results for the training and the testing. The msex,y for training was found to be about the same as in the FFNNHV setup. The correlation coefficient Cx,y was, however, slightly lower. This result indicates that the histories of both inputs and outputs have high relevance to the output generated in the next step. For the testing stage, the msex,y was found to be about the same as in the FFNNHV setup but Cx,y increased.

This higher Cx,y stems from the availability of the input history at each iteration. Similar to FFNNHV, the training trial substantially outperformed the testing trial. This drop in performance can most likely be attributed to the inability of FFNNs to model attractor-like behavior.

Figure 5.14: FFNNHV: Residuals for the x (upper plot) and y (lower plot) components of naturalness for writer wr = 1. (Horizontal axis: update steps, n = 1, ..., 1680.)

Figure 5.15: FFNNHV: Test trial for writer wr = 1.


Table 5.6: FFNNHUV: Mean square errors and correlation coefficients for the x and y components of naturalness.

train
wr    mse_x        mse_y        C_x       C_y
#1    6.48×10⁻³    4.85×10⁻³    88.49%    87.15%
#2    6.61×10⁻³    4.43×10⁻³    86.97%    84.94%
#3    8.08×10⁻³    7.85×10⁻³    92.50%    88.67%
#4    6.92×10⁻³    5.11×10⁻³    89.43%    89.36%
#5    6.94×10⁻³    5.61×10⁻³    87.18%    88.20%

test
wr    mse_x        mse_y        C_x       C_y
#1    2.64×10⁻²    2.09×10⁻²    54.54%    42.99%
#2    2.42×10⁻²    1.80×10⁻²    44.24%    35.33%
#3    3.98×10⁻²    4.35×10⁻²    59.23%    26.96%
#4    2.81×10⁻²    2.29×10⁻²    50.45%    53.79%
#5    4.89×10⁻²    4.80×10⁻²    36.21%    20.90%

Fig. 5.16 plots the teacher signals vs. the signals estimated by the FFNNHUV model in the testing stage for writer wr = 1. Residuals are shown in Fig. 5.17 and the generated handwriting in Fig. 5.18. Plots for other writers can be found in appendix A.6.

It is evident from these tests that the histories of both the inputs and the outputs are vital to performance. Numerically, the FFNNHU setup performed the best because it did not use feedback from the estimated output [5]; its performance is visualized in Fig. 5.11. The history of the outputs appears to be particularly important and caused a substantial increase in performance for the training trials in the FFNNHV and FFNNHUV setups. The testing trials for these two setups, however, demonstrated that the FFNN was incapable of generating appropriate outputs when running on its own; their performance is visualized in Fig. 5.15 and Fig. 5.18, respectively.

[5] For systems with no feedback from the estimated output, the training trial is equal to the testing trial.

Figure 5.16: FFNNHUV: Teacher signals vs. signals estimated by the FFNNHUV model for writer wr = 1. (Panels: Naturalness, x and y components; horizontal axis: update steps, n = 1, ..., 1680; legend: teacher vs. model.)

While several factors may be involved, the inability to model attractor-like behavior is probably the most important one. Because the internal states of an FFNN model are determined only by its inputs, with no influence from previous internal states, attractor-like behavior is difficult to achieve; in the next test, a system that is able to incorporate the impact of previous internal states is therefore investigated.
