CHAPTER 2 RESEARCH METHODOLOGY OF SVM AND ROC CURVE
2.3 RECEIVER OPERATING CHARACTERISTIC (ROC) CURVE
the soft margin constant (C) and any parameter the kernel function may depend on (width of a Gaussian kernel or degree of a polynomial kernel).
The popularity of SVMs has led to the development of several special-purpose solvers for the SVM optimization problem [17]. One of the most widely used is LIBSVM, a library for support vector machines [18]. We conducted this experiment using the LIBSVM classification solver, with the hyperparameters (the soft margin constant C and the Gaussian kernel width γ) selected by grid search.
The dependence of the SVM decision boundary on the SVM hyperparameters translates into a dependence of classifier accuracy on the hyperparameters. When working with a linear classifier, the only hyperparameter that needs to be tuned is the SVM soft margin constant. For Gaussian kernels, the search space is two-dimensional, and the standard method of exploring this space is grid search, where the grid points are generally chosen on a logarithmic scale and classifier accuracy is estimated for each point on the grid. A classifier is then trained using the hyperparameters that yield the best accuracy on the grid. Grid search is thus one way of finding the best C and γ when using the RBF (Gaussian) kernel function [10].
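The grid search described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in this study: the exponent ranges are an assumption (a commonly used coarse grid), and `evaluate` is a hypothetical callable that would, in practice, wrap a cross-validated LIBSVM run and return its accuracy. A toy scoring function stands in for it here so that the sketch is self-contained.

```python
import itertools
import math


def log2_grid(lo_exp, hi_exp, step=2):
    """Return the values 2**lo_exp, 2**(lo_exp + step), ..., up to 2**hi_exp."""
    return [2.0 ** e for e in range(lo_exp, hi_exp + 1, step)]


def grid_search(evaluate, c_values, gamma_values):
    """Try every (C, gamma) pair and return (best_score, best_C, best_gamma).

    `evaluate(C, gamma)` is a hypothetical callable returning an accuracy
    estimate for one grid point (for example, cross-validation accuracy).
    """
    best = None
    for c, gamma in itertools.product(c_values, gamma_values):
        score = evaluate(c, gamma)
        if best is None or score > best[0]:
            best = (score, c, gamma)
    return best


# Assumed exponent ranges on a logarithmic (base-2) scale; not from this study.
C_GRID = log2_grid(-5, 15)      # 2^-5, 2^-3, ..., 2^15
GAMMA_GRID = log2_grid(-15, 3)  # 2^-15, 2^-13, ..., 2^3


def toy_accuracy(c, gamma):
    """Stand-in scoring surface that peaks at C = 2^3 and gamma = 2^-5."""
    return 1.0 / (1.0 + (math.log2(c) - 3) ** 2 + (math.log2(gamma) + 5) ** 2)


best_score, best_c, best_gamma = grid_search(toy_accuracy, C_GRID, GAMMA_GRID)
print(best_c, best_gamma)  # prints 8.0 0.03125
```

A final classifier would then be trained once with `best_c` and `best_gamma`, as the text describes.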
In this study, we first divided the health ratings of the bridge structural members into two binary classification problems, "Good & Poor" and "Good & Fair", and then trained the model.
predicted as negative. False positive (FP) indicates negative records that are incorrectly predicted as positive, while true negative (TN) indicates negative records that are correctly predicted as negative.
Table 2.1 Confusion matrix

                        Predicted class
True class      Predicted Positive      Predicted Negative
Positive        TP (True Positive)      FN (False Negative)
Negative        FP (False Positive)     TN (True Negative)
Classification ability, and the balance between the two classes, is described effectively in terms of sensitivity (the TP rate, or positive class accuracy) and specificity (the TN rate, or negative class accuracy) as follows:
\[ \text{True Positive Rate} = \text{Sensitivity} = \frac{TP}{TP + FN} \tag{2.9} \]

\[ \text{False Positive Rate} = 1 - \text{Specificity} = \frac{FP}{TN + FP} \tag{2.10} \]

More precisely, sensitivity measures the proportion of actual positives that are identified correctly, while specificity measures the proportion of actual negatives that are identified correctly.
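As a small worked example of Eqs. (2.9) and (2.10), the sketch below counts the four confusion-matrix cells from a list of true and predicted labels and then computes the two rates. The labels are made up for illustration, and the function names are not taken from any library.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FN, FP, TN for binary labels (1 = positive, 0 = negative)."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def sensitivity(tp, fn):
    """True positive rate, Eq. (2.9)."""
    return tp / (tp + fn)


def false_positive_rate(fp, tn):
    """False positive rate, i.e. 1 - specificity, Eq. (2.10)."""
    return fp / (tn + fp)


# Made-up example: 4 positive and 6 negative records.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print(tp, fn, fp, tn)        # prints 3 1 1 5
print(sensitivity(tp, fn))   # prints 0.75
specificity = 1.0 - false_positive_rate(fp, tn)  # 5/6 for this example
```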
In addition, the trade-off between sensitivity and specificity can be represented graphically as an ROC curve. It can be understood as a plot of the rate of correctly classifying the positive examples against the rate of incorrectly classifying the negative examples. The ROC curve is constructed by plotting these pairs of values with 1 - specificity on the x-axis and sensitivity on the y-axis.
In fact, when the data are strongly unbalanced, accuracy may be misleading, since the all-positive or all-negative classifier may achieve an exceptionally good classification rate. Unbalanced datasets arise frequently in real-world problems; in these cases, model evaluation is performed through criteria other than accuracy. Metrics extracted from ROC curves, such as the area under the ROC curve (AUC), can be a suitable alternative for model evaluation, because the ROC curve distinguishes between errors on positive examples and errors on negative examples [19].
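A short numerical illustration of this point, using a hypothetical data set with 95 negative and 5 positive records (the class sizes are an assumption chosen only for the example): the all-negative classifier reaches high accuracy while detecting no positive record at all.

```python
# Hypothetical unbalanced data set: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# The "all-negative" classifier predicts 0 for every record.
y_pred = [0] * len(y_true)

# Accuracy: fraction of records whose prediction matches the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Sensitivity: fraction of actual positives that are predicted positive.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
sensitivity = tp / (tp + fn)

print(accuracy)     # prints 0.95
print(sensitivity)  # prints 0.0
```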
Let 𝑓: χ → ℝ be a classifier that assigns a real number to an instance x ∊ χ. The instance x may or may not include an event that we would like to detect. If 𝑓(𝑥) > 𝑐, the classifier 𝑓 predicts that x includes the event. Otherwise, if 𝑓(𝑥) < 𝑐, the classifier predicts that the instance does not include the event.
Here, c is a threshold that takes values from -∞ to +∞. If, for example, c approaches -∞, then 𝑓(𝑥) > 𝑐 always holds and the classifier predicts that every instance x includes the event; there are no missed detections but many false detections.
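The effect of sweeping the threshold c can be sketched as follows. This is an illustrative implementation under simple assumptions (binary labels coded as 1 and 0, at least one record of each class, made-up scores); it is not the code used in this study. Each threshold yields one (FP rate, TP rate) pair, and the collection of pairs traces the ROC curve.

```python
def roc_points(scores, labels):
    """Return ROC points (FP rate, TP rate) obtained by sweeping threshold c.

    An instance is predicted positive when its score is strictly greater
    than c. Labels are 1 (event present) or 0 (event absent).
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    # c = +inf predicts nothing as positive; c = -inf predicts everything as
    # positive; in between, each distinct score is tried in decreasing order.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True) + [float("-inf")]
    points = []
    for c in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s > c and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s > c and y == 0)
        point = (fp / n_neg, tp / n_pos)
        if not points or points[-1] != point:  # skip repeated points
            points.append(point)
    return points


# Made-up classifier outputs f(x) and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
print(curve[0], curve[-1])  # prints (0.0, 0.0) (1.0, 1.0)
```

The first point corresponds to c approaching +∞ (no detections at all) and the last to c approaching -∞ (every instance flagged), matching the limiting case discussed in the text.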
Let us denote by 𝑥⁺ an instance that contains the event and by 𝑥⁻ an instance that does not contain the event. Then TP means that 𝑓(𝑥⁺) > 𝑐, while FP means that 𝑓(𝑥⁻) > 𝑐. The ROC curve is a graph of TP rate versus FP rate. The AUC, A, is defined as the following integral [20]:
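\[ A = \int_{0}^{1} P\big(f(x^{+}) > c\big)\, dP\big(f(x^{-}) > c\big) \tag{2.11} \]

\[ \phantom{A} = \int_{-\infty}^{+\infty} P\big(f(x^{+}) > c\big)\, p_{f(x^{-})}(c)\, dc \tag{2.12} \]

\[ \phantom{A} = P\big(f(x^{+}) > f(x^{-})\big) \tag{2.13} \]

where \( p_{f(x^{-})}(c) \) is the probability density function of the random variable \( f(x^{-}) \) at point c.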
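Equation (2.13) suggests a direct empirical estimate of the AUC: the fraction of (positive, negative) pairs in which the positive instance receives the higher score. The sketch below, on made-up scores, compares this pairwise estimate with the trapezoidal area under the corresponding ROC curve. Counting tied pairs as one half is a common convention and is an assumption here, since Eq. (2.13) is stated for continuous scores where ties do not occur.

```python
def auc_pairwise(pos_scores, neg_scores):
    """Estimate P(f(x+) > f(x-)) over all (positive, negative) pairs.

    Tied pairs count as 0.5 (an assumed convention, see the lead-in).
    """
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve.

    `points` is a list of (FP rate, TP rate) pairs ordered by FP rate.
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up scores for three positive and three negative instances.
pos_scores = [0.9, 0.8, 0.6]
neg_scores = [0.7, 0.4, 0.3]

# ROC vertices for the same scores, written out by hand.
roc = [(0.0, 0.0), (0.0, 2 / 3), (1 / 3, 2 / 3), (1 / 3, 1.0), (1.0, 1.0)]

a_pairs = auc_pairwise(pos_scores, neg_scores)  # 8 of 9 pairs ranked correctly
a_area = auc_trapezoid(roc)                     # same value, up to rounding
```

Both estimates equal 8/9 for this example: only the pair (0.6, 0.7) is ranked incorrectly.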
Simundic's [21] study on diagnostic accuracy shows that the shape of the ROC curve and the AUC help us estimate the discriminative power of a test. The AUC can take any value between 0 and 1 and is a good indicator of the overall quality of the test.
A perfect diagnostic test has an AUC of 1.0, an excellent test has an AUC of around 0.9 to 1.0, a very good test has an AUC of around 0.8 to 0.9, a good test has an AUC of around 0.7 to 0.8, a sufficient test has an AUC of around 0.6 to 0.7, a bad test has an AUC of around 0.5 to 0.6, and a useless test has an AUC of 0.5 or less. Generally, the relation between AUC and diagnostic accuracy (as described in Table 2.2) and the ROC curve can be plotted as shown in Figure 2.2 [21]. In our study, the AUC rating is determined according to Table 2.2.

Table 2.2 Interpretation of the AUC

AUC         Diagnostic Accuracy
0.9-1.0     Excellent
0.8-0.9     Very good
0.7-0.8     Good
0.6-0.7     Sufficient
0.5-0.6     Bad
<0.5        Test not useful

Figure 2.2 The ROC curve (refer Simundic et al. 2012)
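The rating bands of Table 2.2 can be encoded as a simple lookup. The sketch below is illustrative only; in particular, the table does not say which band a boundary value such as 0.8 belongs to, so treating each lower bound as inclusive is an assumption.

```python
def auc_rating(auc):
    """Map an AUC value to the diagnostic accuracy label of Table 2.2.

    Assumption: the lower bound of each band is inclusive, so an AUC of
    exactly 0.8 is rated "Very good" and exactly 0.5 is rated "Bad".
    """
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    bands = [
        (0.9, "Excellent"),
        (0.8, "Very good"),
        (0.7, "Good"),
        (0.6, "Sufficient"),
        (0.5, "Bad"),
    ]
    for lower_bound, label in bands:
        if auc >= lower_bound:
            return label
    return "Test not useful"


print(auc_rating(0.85))  # prints Very good
print(auc_rating(0.42))  # prints Test not useful
```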
REFERENCES
[1] The Road Committee of the Panel on Infrastructure Development: Recommendation for Full-scale Maintenance of Aging Roads, April 14, 2014. https://www.mlit.go.jp/road/road_e/pdf/recommendation.pdf
[2] Masahiro S., Takashi T.: Bridge inspection standards in Japan and US, Proceedings of the 29th US - Japan Bridge Engineering Workshop, Tsukuba, Japan, 2013.
[3] Arong, Murakami S., Hosoe I.: A proposal for weighted optimization method on bridge health integrity through corroded steel bridge infrastructure, Proceedings of the 7th International Conference on Bridge Maintenance, Safety and Management (IABMAS 2014), Shanghai, China, pp. 1495-1502, 2014.
[4] Oishi H., Kobayashi H., Yun Y., Tanaka H., Nakayama H. and Furukawa K.: Evaluation of the danger of slopes in view of the effective value of measure constructions using support vector machine (Japanese), Journal of Japan Society of Civil Engineers, Ser. F, Vol. 63, pp. 107-118, 2007.
[5] Chikata Y., Aso K., Kameda J. and Kido T.: Applicability of SVM and LVQ to evaluation of bridge integrity (Japanese), Journal of Applied Computing in Civil Engineering, Vol. 16, pp. 175-184, 2007.
[6] Sugimoto H., Ichima K., Abe J. and Furukawa K.: On synthetic health evaluation of infrastructure by SVM and its application to ranking of structures (Japanese), Journal of Japan Society of Civil Engineers, Ser. A, Vol. 65, pp. 658-669, 2009.
[7] Yuki K., Kobayashi H., Ohishi H., Sugimoto H., Iida T. and Furukawa K.: Evaluation for judgement criteria of repair on civil engineering structure by support vector machine (Japanese), Journal of Japan Society of Civil Engineers, Ser. F4 (Construction and Management), Vol. 68, pp. 52-61, 2012.
[8] Cortes C., Vapnik V.: Support-vector networks, Machine Learning, Vol. 20, pp. 273-297, 1995.
[9] Vapnik V.: Statistical Learning Theory, ISBN 0-471-03003-1, 1998.
[10] Ben-Hur A., Weston J.: A user’s guide to support vector machines, Data Mining Techniques for the Life Sciences, pp. 223-239, 2010.
[11] Schölkopf B., Smola A.: Learning with Kernels, MIT Press, Cambridge, MA, 2002.
[12] Cristianini N., Shawe-Taylor J.: An Introduction to Support Vector Machines, Cambridge UP, Cambridge, UK, ISBN 0-521-78019-5, 2000.
[13] Ruszczyński A.: Nonlinear Optimization, Princeton University Press, ISBN 978-0691119151, 2006.
[14] Akbani R., Kwek S. and Japkowicz N.: Applying support vector machines to imbalanced datasets, Proceedings of the 15th European Conference on Machine Learning, pp. 39-50, 2004.
[15] Krystallenia D., Stelios G., Christos K. and Stella S.: Support Vector Machines Classification on Class Imbalanced Data: A Case Study with Real Medical Data, Journal of Data Science, Vol. 12, pp. 727-754, 2014.
[16] Chapelle O.: Training a support vector machine in the primal, Neural Computation, Vol. 19 (5), pp. 1155-1178, 2007.
[17] Bottou L., Chapelle O., DeCoste D. and Weston J.: Large Scale Kernel Machines, MIT Press, Cambridge, MA, 2007.
[18] Chang C. C., Lin C. J.: LIBSVM: a library for support vector machines, ACM Transactions on Intelligent Systems and Technology, Vol. 2 (3), pp. 1-27, 2011.
[19] Rakotomamonjy A.: Optimizing area under ROC curve with SVMs, Proceedings of 1st international workshop on ROC Analysis in Artificial Intelligence, Spain, pp. 71-80, 2004.
[20] Bamber D.: The Area above the Ordinal Dominance Graph and the Area below the Receiver Operating Characteristic Graph, Journal of Mathematical Psychology, Vol. 12, pp. 387-415, 1975.
[21] Simundic A.: Diagnostic Accuracy-Part 1: Basic Concepts, Journal of Near-Patient Testing & Technology, Vol. 11, pp. 6-8, 2012.