4.4 Experiments, Results and Discussion
4.4.2 Gender-based Classification
increased, the possibility of sharing contextual key points becomes higher, which results in the emergence of larger clusters in the embedding space. This effect can be seen in the 2d t-SNE embeddings of the trained representation vectors for different context window sizes in Figure 4.6. The context window size is set to 1 for the embeddings shown in Figure 4.6(a) and to 8 for those shown in Figure 4.6(b). When the window size is small, the representation vectors are spread evenly in small concentrations, whereas the larger context window produces larger, more pronounced concentrations. In the end, the best choices for the hidden layer size and the context window size depend on the application, and there are trade-offs to be made.
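The effect of the window size on shared context can be illustrated with a small sketch. It is not the training code used in this work; the function names and the toy key-point sequences are hypothetical, and the window is assumed to be symmetric around the target key point.

```python
def context_pairs(sequence, window):
    """Return (target, context) pairs for a symmetric window of the given size."""
    pairs = []
    for i, target in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, sequence[j]))
    return pairs


def context_sets(sequences, window):
    """Map each key point to the set of distinct key points seen in its context."""
    contexts = {}
    for seq in sequences:
        for target, ctx in context_pairs(seq, window):
            contexts.setdefault(target, set()).add(ctx)
    return contexts


# Two toy trajectories written as sequences of key-point labels (made up).
trips = [["A", "B", "C", "D", "E"], ["A", "C", "E", "F", "G"]]

narrow = context_sets(trips, window=1)
wide = context_sets(trips, window=4)

# With a larger window, each key point shares context with more key points.
print(sorted(narrow["A"]))
print(sorted(wide["A"]))
```

In this toy case the wider window gives every key point a larger set of context neighbors, which is the mechanism the paragraph above associates with larger clusters in the embedding space.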
[Figure 4.6, panels (a) to (d): 2d t-SNE scatter plots. Horizontal axis: 1st Component; vertical axis: 2nd Component.]
Figure 4.6: 2d t-SNE visualizations of the resulting embedding vectors for different model configurations. The numbers following the letters H and C represent the dimensionality of the embedding space and the bidirectional context window size, respectively. Model configurations:
(a) H16C1 (b) H16C8 (c) H32C4 (d) H32C16
and the minimum number of neighbors was set to 10. At each level, the neighborhood radius was doubled, and clustering was performed on the points left as noise by the previous level. Each of the resulting clusters was then assigned a unique label. The results are shown in Figure 4.7.
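The multi-level procedure described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: it assumes planar coordinates in kilometres (real GPS fixes would need a geodesic distance), it counts a point as its own neighbor, and the names `dbscan` and `multilevel_dbscan` are hypothetical.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, NOISE for unclustered points."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
        cluster += 1
    return labels


def multilevel_dbscan(points, base_eps, min_pts, levels):
    """Cluster, then re-cluster the remaining noise with a doubled radius per level."""
    assigned = {}  # point index -> (level, cluster id within that level)
    remaining = list(range(len(points)))
    eps = base_eps
    for level in range(1, levels + 1):
        subset = [points[i] for i in remaining]
        labels = dbscan(subset, eps, min_pts)
        still_noise = []
        for idx, lab in zip(remaining, labels):
            if lab == NOISE:
                still_noise.append(idx)
            else:
                assigned[idx] = (level, lab)
        remaining = still_noise
        eps *= 2
    return assigned, remaining


# Toy data: a dense group, a sparser group, and one isolated point.
dense = [(0.1 * k, 0.0) for k in range(12)]
sparse = [(100.0 + 0.5 * k, 100.0) for k in range(12)]
points = dense + sparse + [(500.0, 500.0)]

assigned, noise = multilevel_dbscan(points, base_eps=1.5, min_pts=10, levels=2)
```

With a base radius of 1.5 km and five levels, the radii would be 1.5, 3, 6, 12 and 24 km, matching the values given for Figure 4.7. In the toy data, the dense group is found at level 1, the sparser group only at level 2, and the isolated point stays noise.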
As seen in the results, the lower levels capture topographical features. For instance, in the level 1 and level 2 key points, the colony's location and persistent foraging locations are identifiable. In contrast, levels 4 and 5 generally capture transient locations, which may have been caused by the environmental conditions at the time. Since the method introduced in the previous chapter used solely spatial information, the features here were also extracted strictly from spatial information for comparison purposes. Based on the distribution of average speed densities shown in Figure 4.8(a), two features were extracted for each trajectory segment containing a key point: the mean direction of flight, associated with speeds equal to or above 2.5 m/s, and the mean direction of drift, associated with speeds below 2.5 m/s. These were quantized into 4 directions and a neutral label. Histograms of this feature for female and male birds are shown in Figure 4.8(c) and Figure 4.8(d).
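A minimal sketch of this direction feature is given below. Several details are assumptions made for illustration and are not specified in the text: headings are taken in degrees clockwise from north, the four directions are 90-degree sectors centred on N, E, S and W, and the neutral label is used when a mode has no samples or no well-defined mean direction. All function names are hypothetical.

```python
import math

SPEED_THRESHOLD = 2.5  # m/s, flight at or above this speed, drift below it


def circular_mean_deg(headings):
    """Circular mean of headings in degrees, or None if it is undefined."""
    if not headings:
        return None
    s = sum(math.sin(math.radians(h)) for h in headings)
    c = sum(math.cos(math.radians(h)) for h in headings)
    if math.hypot(s, c) < 1e-9:  # headings cancel out, no dominant direction
        return None
    return math.degrees(math.atan2(s, c)) % 360.0


def quantize(direction):
    """Map a direction in degrees to N/E/S/W, or 'neutral' if it is None."""
    if direction is None:
        return "neutral"  # assumed meaning of the neutral label
    sector = int(((direction + 45.0) % 360.0) // 90.0)
    return ("N", "E", "S", "W")[sector]


def direction_features(samples):
    """samples: list of (speed in m/s, heading in degrees) for one segment."""
    flight = [h for v, h in samples if v >= SPEED_THRESHOLD]
    drift = [h for v, h in samples if v < SPEED_THRESHOLD]
    return quantize(circular_mean_deg(flight)), quantize(circular_mean_deg(drift))


# Made-up segment: fast samples heading roughly north, slow ones roughly east.
segment = [(5.0, 350.0), (6.0, 10.0), (1.0, 90.0), (0.5, 100.0)]
features = direction_features(segment)
```

The circular mean is used instead of an arithmetic mean so that headings of 350 and 10 degrees average to north rather than to 180 degrees.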
These features are designed to encapsulate information about local activities and environmental factors at the corresponding key points. For the average sequence length of about 71 extracted from the trajectories, the dimension of the embedding vectors was set to 64. The dictionary size was set to the 500 most frequent key points. For the recurrent neural network, a long short-term memory (LSTM) network [146] with a single layer of 128 cells was used.
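Restricting the dictionary to the most frequent key points can be sketched as follows. The reserved indices for padding and out-of-dictionary key points, and the function names, are assumptions for illustration; the text does not state how rare key points or short sequences were handled.

```python
from collections import Counter

PAD, OOV = 0, 1  # assumed reserved indices, not specified in the text


def build_vocab(sequences, max_size=500):
    """Index the max_size most frequent key points, starting after PAD and OOV."""
    counts = Counter(kp for seq in sequences for kp in seq)
    return {kp: i + 2 for i, (kp, _) in enumerate(counts.most_common(max_size))}


def encode(sequence, vocab, length):
    """Convert key-point labels to indices, truncated or padded to a fixed length."""
    ids = [vocab.get(kp, OOV) for kp in sequence][:length]
    return ids + [PAD] * (length - len(ids))


# Toy key-point sequences (made up); a dictionary of size 2 keeps c0 and c1 only.
sequences = [["c0", "c1", "c0"], ["c0", "c1", "c2"]]
vocab = build_vocab(sequences, max_size=2)
encoded = encode(["c1", "c2", "c0"], vocab, length=5)
```

The resulting index sequences are the kind of input an embedding layer followed by an LSTM would consume, with `max_size=500` corresponding to the dictionary size used here.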
The gender classification results achieved from embedding vectors with LSTM (LSTM-EMBD-SP) were compared with those obtained from the raw spatial coordinates of key points with LSTM (LSTM-SP). These results are listed in Table 4.1.
Two sets of results are reported for the embedding vectors. One is LSTM-EMBD-SP-1, which uses vector representations created by the skip-gram model, and the
[Figure 4.7, panels (a) to (d)]
Figure 4.7: Multi-level DBSCAN clustering of Streaked Shearwater trajectory points. Level 0 is the trajectory points. Levels 1, 2, 3, 4 and 5 are the detected clusters with neighborhood radii set to 1.5, 3, 6, 12 and 24 km, respectively. The minimum number of neighbors is set to 10. Each centroid's marker size is proportional to the number of trajectories sharing the corresponding key point. (a) Trajectory points assigned to the detected clusters for each level. (b) Centroid points of detected clusters for each level. (c) Sample centroid points of Level 1 clusters. L1 0 is located at the colony. (d) Sample centroid points of Level 3 clusters.
[Figure 4.8, panels (a) to (d)]
Figure 4.8: Semantic features extracted for key points, based on speed, time and direction. Each of these features is discrete. (a) Dominant mode of activity, either flight mode or floating on water (designated as drift mode), based on a 2.5 m/s speed threshold. (b) Time spans for which birds remained at key points. (c) Discretized directions female birds took at the “L3 11” key point in Figure 4.7(d).
Table 4.1: For each method, the mean and standard deviation of the validation accuracies and the test accuracy are listed.
Method              Acc. Mean (%)   Acc. Std. (%)   Test Acc. (%)*
LSTM-SP             57.11           3.88            57.09
LSTM-EMBD-SP-1      69.85           3.01            68.36
LSTM-EMBD-SP-2      68.31           2.67            -
LSTM-EMBD-SPT       73.10           2.22            -
SVM-ENTLCSS [37]    63.03           2.44            61.09
* Test accuracies are obtained via CodaLab's ABC2018 [143] submissions.
other is LSTM-EMBD-SP-2, which uses the ones created by the CBOW model. There is no wide performance gap between the two. In addition, the performance of classification using entropy and LCSS (SVM-ENTLCSS) [37] is included. For this method, the highest-performing classifier parameters were selected. Using vector embeddings as input improved the accuracy by about 7 percentage points over SVM-ENTLCSS and by more than 10 percentage points over the model using raw spatial coordinates as input (Table 4.1).
Again, it should be noted that only spatial information was used for creating the features.
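The size of these gaps can be checked directly from the figures in Table 4.1. The short sketch below only restates the table values; the dictionary layout and the `gain` helper are hypothetical.

```python
# (validation accuracy mean, test accuracy) in %, copied from Table 4.1.
# None marks a test accuracy that the table does not report.
table = {
    "LSTM-SP": (57.11, 57.09),
    "LSTM-EMBD-SP-1": (69.85, 68.36),
    "LSTM-EMBD-SP-2": (68.31, None),
    "SVM-ENTLCSS": (63.03, 61.09),
}

VALIDATION, TEST = 0, 1


def gain(method, baseline, column):
    """Difference in percentage points, or None if either value is missing."""
    a, b = table[method][column], table[baseline][column]
    if a is None or b is None:
        return None
    return round(a - b, 2)


for baseline in ("SVM-ENTLCSS", "LSTM-SP"):
    print(baseline,
          gain("LSTM-EMBD-SP-1", baseline, VALIDATION),
          gain("LSTM-EMBD-SP-1", baseline, TEST))
```

The differences are in percentage points of accuracy, not relative improvements.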
A close examination of the variance of the validation results, and of the validation data itself, shows that some trajectories are short or spatially uninformative. These are probably gender neutral, and their key points are common to both genders. This is analogous to neutral-sentiment sentences in the sentiment analysis of documents. Furthermore, time, calendar and features extracted from other domains may also be informative for predicting the gender associated with a trajectory. Here, an additional experiment was performed by including a new feature based on the continuous time span of a trajectory segment assigned to a key point. In other words, it measures the amount of time that a seabird remains at a key point. Figure 4.8(b) shows the distribution of the measured time spans, in minutes, for all trajectory key points. Two main densities were identified, separated by a cut-off point set at 5 minutes. With this new feature, the network was retrained and tested. The results, labelled LSTM-EMBD-SPT, are listed in Table 4.1. This model achieved a gain of about 4 percentage points over the results obtained using only spatial features. This illustrates the potential of features from other domains to further improve the classification results. However, this is outside the scope of this work and is suggested as a follow-up study. The main objective here was to evaluate the advantages of using embeddings in trajectory classification.
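The time-span feature can be sketched as follows. The record layout, the function names, the label names and the choice to treat a span of exactly 5 minutes as long are assumptions for illustration only.

```python
CUTOFF_MINUTES = 5.0  # cut-off between the two densities, from the text


def dwell_spans(records):
    """Return (key point, minutes) for each continuous run at one key point.

    records: time-ordered (timestamp in seconds, key point label or None),
    where None marks a fix that is not assigned to any key point.
    """
    spans = []
    run_kp, start, last = None, None, None
    for t, kp in records:
        if kp != run_kp:
            if run_kp is not None:
                spans.append((run_kp, (last - start) / 60.0))
            run_kp, start = kp, t
        last = t
    if run_kp is not None:
        spans.append((run_kp, (last - start) / 60.0))
    return spans


def span_label(minutes, cutoff=CUTOFF_MINUTES):
    """Discretize a time span into two classes around the cut-off."""
    return "long" if minutes >= cutoff else "short"


# Made-up fixes: 10 minutes at key point A, one unassigned fix, 1 minute at B.
records = [(0, "A"), (120, "A"), (600, "A"), (660, None), (720, "B"), (780, "B")]
spans = dwell_spans(records)
labels = [(kp, span_label(minutes)) for kp, minutes in spans]
```

A run is closed as soon as the bird leaves the key point, so a later return to the same key point starts a new span rather than extending the earlier one.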