Chapter 4 Plastic Identification using SVM Technique and PCA-SVM Integration Techniques

4.2. Plastic Identification using PCA-SVM for Industrial Field

data of the two peaks become weak, and the class information is obtained appropriately. The inference of this section is that C-H bending and C=N stretching are the most suitable information sources for plastic identification by Raman spectra.

From the new dataset, the SVM technique clearly identifies and separates the plastic samples into their classes with 100% accuracy, as displayed in Fig. 4.12. The maximum margins between support vectors become largest: PP-PS is 2.39, PP-ABS is 2.19, and PS-ABS is 0.28. The total number of support vectors decreases to 7 (PP = 2, PS = 2, and ABS = 3). This number of support vectors indicates that the data are better dispersed in feature space and that the construction of the optimal hyperplane is more stable, because the support vectors are spread evenly among the classes.

Figure 4.10. Percentage contribution of correlated information to the principal components.

To evaluate the robustness of the PCA-SVM classification model, noise is added to the Raman spectra to simulate conditions in the plastic recycling industry, where other devices may cause distortion while plastic identification is running at the sorting stage. Fig. 4.13 shows the PCA-SVM classification of data with noise at 0.75 times the normal noise signal intensity, with an accuracy of 0.99, or 99%. In the figure, 1 of 20 ABS samples is identified in the PP area, which means that the accuracy for ABS is 95% with 0.75 times noise, while PP and PS remain at 100%.
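The per-class accuracy quoted above follows directly from the count of misclassified samples. The short sketch below illustrates that arithmetic; the function name `class_accuracy` is a hypothetical helper introduced here, not something taken from the study.

```python
def class_accuracy(n_samples, n_misclassified):
    """Per-class accuracy in percent from a count of misclassified samples.

    Hypothetical helper for illustration; the name does not come from the study.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0 <= n_misclassified <= n_samples:
        raise ValueError("n_misclassified must lie between 0 and n_samples")
    return 100.0 * (n_samples - n_misclassified) / n_samples


# ABS at 0.75 times noise: 1 of 20 samples falls in the PP area.
abs_accuracy = class_accuracy(20, 1)
```

With the figures stated in the text (1 of 20 ABS samples misclassified), the helper returns 95.0.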

Figure 4.11. Percentage contribution to the principal components: a) contribution to PC1, b) contribution to PC2.

Figure 4.12. Plot of the PCA-SVM classification using the original data.

The next plot, displayed in Fig. 4.14, uses data with 1.5 times noise. The overall accuracy is 98%, with PP and ABS both at 100%. Two PS samples fall halfway between the PS and ABS areas and are still counted as misclassified, so the accuracy of PS is 94%. Note that the ABS sample that was misclassified in the previous plot is now classified correctly; this is because the noise signal is generated randomly in every run, so in this case the ABS accuracy becomes 100%. This is discussed in more detail later.

Figure 4.13. Plot of the PCA-SVM classification for data with 0.75 times added noise.

The data with 2.5 times added noise reach an accuracy of 98%, the same as the previous noise level, yet the accuracy of each class differs: one PS sample is misidentified in the PP area, so the PS accuracy is 97%, and ABS also has one misclassified sample. These results can be seen in Fig. 4.15.

Figure 4.16 shows the plot of data with 3.75 times added noise. Four ABS samples are misclassified as PS (accuracy 81%), and one PS sample is misclassified as PP, so the accuracy of PS is 97%. The PP samples remain at 100% accuracy even though their spread increases. The overall accuracy at this noise level is 96%.

Figure 4.14. Plot of the PCA-SVM classification for data with 1.5 times added noise.

At this noise level the identification accuracy begins to depart from the expected level of 95%. Fig. 4.17 shows the plot of data with 7.5 times noise, where the accuracy decreases sharply to 75%. Misidentification occurs in all classes. ABS appears weakest because it is identified as both PP and PS: 3 ABS samples are recognized as PP and 7 as PS, so the accuracy becomes 48%. The misclassification of PP is similar to that of PS in that each is confused with only one other class: 2 PP samples are classified as PS, giving 97% accuracy, while 4 PS samples are detected as ABS, giving 88% accuracy.

Figure 4.15. Plot of the PCA-SVM classification for data with 2.5 times added noise.

Figure 4.16. Plot of the PCA-SVM classification for data with 3.75 times added noise.

The last test uses noise at 15 times the normal noise signal intensity, which is a severe condition for the PCA-SVM classification model, as shown in Fig. 4.18. Incorrect identification arises in all classes, and some samples are even plotted outside the feature space. Of the PP samples, 3 are identified as ABS and 3 as PS, while 15 samples are plotted outside the feature space. Of the ABS samples, 4 are detected as PP and 3 as PS; of the PS samples, no more than 5 are recognized as PP and 3 as ABS. The accuracy drops to around 69%.

Figure 4.17. Plot of the PCA-SVM classification for data with 7.5 times added noise.

Because the noise intensity is generated from random values, which can produce a different noise signal each time, we repeated the measurement five times to find out how much the randomization changes the accuracy. Fig. 4.19 shows that the accuracy decreases slowly at noise levels of 0.75 to 3.75 times, yet drops progressively at levels of 7.5 to 15 times.
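The repeated noise experiment can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the noise distribution, so zero-mean Gaussian noise is assumed here, and `base_sigma` (the "normal" noise intensity), `add_noise`, and `repeated_noisy_spectra` are hypothetical names. Only the amplification levels and the five repetitions come from the text.

```python
import random

# Amplification levels used in this section.
NOISE_LEVELS = [0.75, 1.5, 2.5, 3.75, 7.5, 15]


def add_noise(spectrum, amplification, base_sigma=1.0, seed=None):
    """Return a copy of the spectrum with zero-mean Gaussian noise added.

    Assumption: the noise is Gaussian and its standard deviation scales
    linearly with the amplification factor. base_sigma is a placeholder
    for the normal noise intensity, which the text does not quantify.
    """
    rng = random.Random(seed)
    sigma = amplification * base_sigma
    return [value + rng.gauss(0.0, sigma) for value in spectrum]


def repeated_noisy_spectra(spectrum, n_runs=5, base_sigma=1.0):
    """Build n_runs independently randomized noisy copies per noise level."""
    return {
        level: [add_noise(spectrum, level, base_sigma, seed=run)
                for run in range(n_runs)]
        for level in NOISE_LEVELS
    }
```

Each run uses a different seed, so the same spectrum can land on either side of a class boundary from one run to the next, which is why averaging over several runs gives a steadier accuracy estimate than a single plot.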

Comparing PCA-SVM with SVM without PCA is a way to evaluate the impact of integrating PCA into the SVM technique. Fig. 4.20 displays the difference in average accuracy over the five measurements between PCA-SVM and SVM.

The training data for the SVM measurements use the whole dimension of the dataset, as in the peak pair combinations, while noise is added to the testing data with the same amplification levels as in the integration technique. The figure shows that the accuracy of SVM is always lower than that of PCA-SVM; from a noise amplification of 1.5 times onward, the SVM accuracy begins to drop and the difference becomes large.

Figure 4.18. Plot of the PCA-SVM classification for data with 15 times added noise.

Summary

This section explores in depth the advantages of PCA-SVM integration for plastic identification in the industrial field. The discussion is divided into two sub-sections: a) the benefit of combining PCA and SVM into a solid technique, and b) the robustness of the PCA-SVM classification model, measured by adding noise.

Figure 4.19. Accuracy of the five repeated experiments with the PCA-SVM technique.

a) PCA-SVM integration

The previous section discussed the principle for determining the performance of the SVM technique, which consists of three criteria. Two of the three criteria are improved by the PCA-SVM integration technique, while one of them has already reached its maximum. First, the integration technique extends the margin between support vectors significantly, by roughly four to six times. In the P1-P4 combination, the PP-PS distance increases from 0.388 to 2.39, PP-ABS from 0.377 to 2.19, and PS-ABS from 0.077 to 0.28. The expanded margin provides ample space, so that the identification process is more adaptive to material conditions such as dirty samples or samples contaminated with microorganisms.
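The size of the margin expansion can be checked directly from the values quoted above. The sketch below only divides the margin after PCA by the margin before PCA for each class pair; the dictionary layout and the name `margin_gain` are illustrative choices, not taken from the study.

```python
# Margins for the P1-P4 combination: (before PCA, after PCA), as quoted in the text.
MARGINS = {
    "PP-PS": (0.388, 2.39),
    "PP-ABS": (0.377, 2.19),
    "PS-ABS": (0.077, 0.28),
}


def margin_gain(before, after):
    """Multiplicative growth of a margin (after / before)."""
    if before <= 0:
        raise ValueError("the margin before PCA must be positive")
    return after / before


gains = {pair: round(margin_gain(before, after), 2)
         for pair, (before, after) in MARGINS.items()}
```

The ratios come to roughly 6.2 for PP-PS, 5.8 for PP-ABS, and 3.6 for PS-ABS. The PS-ABS pair gains the least and also keeps by far the smallest absolute margin, which fits the later observation that most errors under noise occur between PS and ABS.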

Figure 4.20. Comparison of accuracy between PCA-SVM and SVM.

Second, the reduction and balancing of the number of support vectors indicate that the inter-correlated quantitative features, which correspond to a linear combination of P1, P2, P3, and P4, are handled well by the PCA technique and contribute substantially to the performance of SVM training in establishing the optimum hyperplane. Linear SVM classification requires one or two support vectors in each class to build a linear hyperplane. If there are three or more in each class pair, a nonlinear classification form is predicted, which affects the ability of SVM training to achieve the optimum hyperplane with maximum margin. In this case the number of support vectors is PP = 3, PS = 3, and ABS = 4; because the problem consists of three classes and is therefore a multiclass classification, this number of support vectors conforms to the linear SVM classification rule.

The significant improvement in the maximum margin and in the number of support vectors is due to the collection of important information into the principal components. Fig. 4.10 shows that 72% of the information is collected in PC1 and 23% in PC2. The matrix calculation of PCA thus successfully ranks the importance of the linear combination values from the aromatic and aliphatic plastic samples, concentrating as much as 95% in PC1 and PC2. In line with the earlier discussion, C-H bending is the most relevant for distinguishing aromatic and aliphatic samples; benzene ring and C-C stretching are also relevant, although less so than C-H bending, and when collected in PC1 they increase the maximum margin between PP-PS and PP-ABS, as shown in Fig. 4.11. The same figure shows that C=N stretching dominates the contribution to PC2, which increases the distance between PS and ABS. In addition, the high percentage of contribution to PC1 and PC2 confirms that the peak extraction technique used in data preparation is effective in preserving the correlation among the features in the dataset.
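The percentages reported for PC1 and PC2 correspond to what is commonly called the explained variance ratio: each eigenvalue of the covariance matrix divided by the total variance. The text does not say how the PCA was computed, so the following is only a minimal pure-Python sketch that assumes a covariance-based PCA solved by power iteration with deflation; all function names are hypothetical, and the data in the usage note are synthetic, not the Raman peak data of the study.

```python
import random


def covariance_matrix(rows):
    """Sample covariance matrix (one row per spectrum, one column per peak)."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenvalues(matrix, k, iterations=500, seed=0):
    """Largest k eigenvalues of a symmetric positive semi-definite matrix,
    found by power iteration followed by deflation."""
    d = len(matrix)
    rng = random.Random(seed)
    work = [row[:] for row in matrix]
    values = []
    for _ in range(k):
        vec = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        norm = sum(x * x for x in vec) ** 0.5
        vec = [x / norm for x in vec]
        for _ in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(d)) for i in range(d)]
            norm = sum(x * x for x in nxt) ** 0.5
            if norm == 0.0:  # no variance left in the deflated matrix
                break
            vec = [x / norm for x in nxt]
        # Rayleigh quotient of the unit vector gives the eigenvalue estimate.
        value = sum(vec[i] * sum(work[i][j] * vec[j] for j in range(d))
                    for i in range(d))
        values.append(value)
        # Deflation: remove the component that was just found.
        work = [[work[i][j] - value * vec[i] * vec[j] for j in range(d)]
                for i in range(d)]
    return values


def explained_variance_ratio(rows, k=2):
    """Share of the total variance captured by each of the first k components."""
    cov = covariance_matrix(rows)
    total = sum(cov[i][i] for i in range(len(cov)))
    if total == 0.0:
        raise ValueError("the data have zero variance")
    return [value / total for value in top_eigenvalues(cov, k)]
```

As a usage example, calling `explained_variance_ratio(rows, k=2)` on a list of four-feature rows returns two fractions, largest first; values of 0.72 and 0.23 would correspond to the 72% and 23% shown in Fig. 4.10. When features are strongly correlated, as the text argues for the selected peaks, most of the variance ends up in the first one or two components.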

Even though the implementation of PCA yields a significant change in the distance between classes, identification of unknown material remains a limitation of this study. There are several reasons. First, the purpose of the study is to increase purity in industrial sorting, so peak selection was based on laboratory analysis of the chemical bonds of the samples. Second, unknown material can come from many sources, such as the diversity of plastic compositions in plastic products, microorganism contamination, and mixing with other materials such as iron, wood, soil, or paper; categorizing the unknown material would therefore produce many classes. Third, the composition of samples in the training data for machine learning should be balanced and representative.

All the reasons above reveal the disadvantage of machine learning techniques.

However, our approach can overcome these limitations. The quantitative evaluation in the peak selection approach aims to make identification adaptive to changing material conditions: the evaluation can inspect which of all the peaks in the Raman spectra are appropriate for the purity requirement, PCA enlarges the margin among support vectors, and adding noise gives an overview of the possible presence of other material types or environmental conditions. As a result, a restriction on the feature space can be imposed so that unknown materials are forced to fall outside the feature space. This approach categorizes unknown material as a single class regardless of its type.

b) Adding noise

The performance of the PCA-SVM integration model, which improves two of the three criteria of the principle, is examined by adding noise to the testing data, which is a way to obtain extraordinary testing data. This approach is selected because, in the industrial field, signal disturbances from electronic devices or other sources such as heat and whirring are often found; these can disturb other devices, especially the Raman device, where they may affect fluorescence or Rayleigh scattering. The added noise also serves to predict the possibility of unknown materials. The evaluation is expected to yield a robust model with an accuracy of more than 95% in discriminating the plastic type, so that high purity is achieved at the sorting stage of the plastic recycling industry.

The discussion of the results begins with the noise figure (NF) limit of Raman fiber-optical systems reported in previous research, which expresses the parameter in decibels (dB) and states that the maximum NF limit is around 3 dB [47,48]. This means that when the NF exceeds the limit, the desired normal accuracy will certainly not be obtained, and it may not be possible to find any information at all.

Table 4.4 displays the accuracy at each noise amplification level, validated using a linear regression technique, in this case the root mean square error (RMSE).

In addition, the accuracy is compared with that of the SVM technique without PCA, which is validated in the same way. The results show that the amplification levels that meet the desired industrial standard are 0.75 (99.5%), 1.5 (99.1%), and 2.5 (96.9%), while the other levels do not qualify because the accuracy of the classification model is below 95%.

Table 4.4. Validation Accuracy of PCA-SVM and SVM.

No.  Noise amplification level  PCA-SVM                 SVM
                                Accuracy (%)  RMSE      Accuracy (%)  RMSE
1    0.75                       99.5          0.03      98.2          0
2    1.5                        99.1          0.02      95.4          0
3    2.5                        96.9          0.04      89.4          0.01
4    3.75                       93.6          0.12      76.2          0.08
5    7.5                        77.9          0.36      54.2          0.13
6    15                         60.3          0.55      42.8          0.11
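The selection of qualifying noise levels from Table 4.4 can be sketched as a simple threshold filter. The accuracy values below are copied from the table; the tuple layout and the name `qualifying_levels` are illustrative assumptions, and the 95% limit is the industrial requirement stated in the text.

```python
# Rows of Table 4.4: (noise amplification, PCA-SVM accuracy %, SVM accuracy %).
TABLE_4_4 = [
    (0.75, 99.5, 98.2),
    (1.5, 99.1, 95.4),
    (2.5, 96.9, 89.4),
    (3.75, 93.6, 76.2),
    (7.5, 77.9, 54.2),
    (15, 60.3, 42.8),
]

PCA_SVM, SVM = 1, 2  # column indices within each row


def qualifying_levels(rows, column, threshold=95.0):
    """Noise amplification levels whose accuracy is not below the threshold."""
    return [row[0] for row in rows if row[column] >= threshold]


pca_svm_levels = qualifying_levels(TABLE_4_4, PCA_SVM)
svm_levels = qualifying_levels(TABLE_4_4, SVM)
```

With the table values, PCA-SVM qualifies at 0.75, 1.5, and 2.5 times, whereas SVM without PCA qualifies only at 0.75 and 1.5 times.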

When the noise amplification is converted into dB, the minimum amplification equals 3 dB and the maximum is about 18 dB, as shown in Table 4.5.

The discussion above emphasizes the three levels of noise amplification that meet the industrial accuracy requirement, so it can be said that the PCA-SVM technique is evaluated using data with noise of 3, 4, and 8 dB. Accordingly, the classification model discriminates the plastic type at the noise threshold condition of the Raman fiber-optical system with an accuracy of 99.5%, and it still works well, meeting the industry requirement, at around three times that threshold.

Table 4.5. Conversion the noise amplification into dB.

Noise amplification   0.75   1.5   2.5   3.75   7.5   15
dB                    3      4     8     11     17    18

In accordance with the purity requirement of the recycling industry, the accuracy must not fall below 95%. Therefore, the added artificial noise qualifies up to 2.5 times the original artificial noise intensity for evaluating the robustness of the identification process against internal and external disruption. The noise threshold is determined by the required accuracy limit itself and by the threshold condition of the Raman fiber-optical system, which lies far beyond the NF value.
