On the effect of variance-stabilizing transformation to wavelet-based software reliability assessment (Mathematical Models of Decision Making under Uncertainty and Ambiguity and Related Topics)

(1)

On

the

eﬀect

of variance-stabilizing

transformation

to wavelet-based

software reliability

assessment

首都大学東京システムデザイン研究科

肖

雷

Xiao Xiao

Graduate School of

System

Design,

Tokyo Metropolitan University

1 Introduction

Insoftware$1^{\cdot}$eliabilityengineering, ithas become

one

of themain issuesin thisareato

assess

thesoftware

relia-bility quantitatively.Among

over

hundreds ofsoftware reliability models(SRMs) [11, 12,14],non-homogeneous

Poissonprocess(NHPP)basedSRMs have gainedmuch popularityinactual software testingphases. People

are

interestedin estimating the

_software

intensity

_function

ofNHPP-based SRMsfromsoftwarefaultcount datathat

can

be collectedinthesoftware testing phases. Thesoftware intensity function indiscretetimedenotes thenumber of softwarefaults detectedperunit time. Fromtheestimated software intensityfunction,it ispossible to predict

the number of remaining software faults andthe quantitative software reliability, which is defined

as

the

proba-bility that softwaresystemdoesnotfail during

a

specified time periodunder

a

specified operationalenvironment.

Therefore,

we

are

interested indevelopingahigh-accuracyestimation methodforthe software intensity function.

We proposed waveletshrinkageestimation(WSE)

as

non-parametricestimationmethod for NHPP-based SRMs

([15, 16, 17 The WSE does notrequiresolvinganyoptimizationproblem,

so

that the implementation of

es-timationalgorithms is rathereasy than the othernon-parametric methods. We compared

our

method with the

conventional maximumlikelihoodestimation(MLE)andtheleastsquares estimation(LSE)through

goodness-of-fit test. Ithas beenshown that WSEcould providehighergoodness-of-fitperformance than MLE and LSB inmany

cases,inspite ofits non-parametricnature,through numerical experimentswithreal softwarefault count data. Thefundamental idea of WSEistoremovethe noise includedintheobserved software faultcountdatatoget

a

noise-freeestimateofthesoftware intensity function. Itisperformed through the following three-step procedure.

First,the noisevariance is stabilizedby applying the variance-stabilizingtransformationtothedata. This produces

a

time-series data inwhichthenoise

can

betreated

as

Gaussianwhitenoise. Second,the noise is removed using

threshold methods. Third,

an

inverse variance-stabilizing transformation is applied tothe denoised time-series

data,obtaining theestimateofthe softwareintensityfunction. Thispaperfocusesonthefirst andthe third steps of WSE andaimsatidentifyingandemphasizingtheinfluence that thedatatransformation exertsontheaccuracy

of WSE. The remaining part of thispaperis planmed

as

follows. InSection2,wegiveapreliminaryonwavelet analysis, especially focusing on multiresolution representation. Section 3 describes the WSE for NHPP-based

SRMs in details. InSection 4,wecarryout the realproject dataanalysisand examinethe eﬀectivenessof eight

(2)

2 Multiresolution Representation

Let$f(t\rangle$denote the targetfunctiononcontinuous time$t$

.

Inmultiresolution representation, targetfunctionis

usuallyexpressed bythe linear combinationofitsapproximation component anddetailcomponent, i.e.,

$f(t) = f_{jc!}(t)+ \sum_{j=j(}^{+\infty}g_{j}(t)$, (1)

where$f_{j)}(t)$and$g_{j}(t)$arethe level-joapproximationcomponentand level-j detail component of$f(t)$,respectively.

Here,parameter$j(\geq 0)$is called resolution _level,_and$jo$ isprimary resolution level. The levei-japproximation

component$f_{j}(t)$andlevel-jdetail$g_{j}(t)$component

are

defined

as

follows.

$f_{j}(t) \sum_{k=-\infty}^{+\infty}a_{jf}\phi_{JJ}(t\rangle,$ (2)

$g_{j}(t) = \sum_{k=-\infty}^{+\infty}\beta_{j,k}\psi_{ノ^{}a,k}(t\rangle$

.

(3)

Here,$\alpha_{j\lambda}$and$\beta_{j.k}$

are

the so-called scaling

coeﬃcients

and waveletcoeﬃcients,respectively.$\phi_{j,k}(t)$and$\psi_{J^{1k}},(t)$

are

waveletbases that

are

generatedfrom

_father

waveletand motherwavelet,respectively. Ifwe choose Haarwavelet,

ofwhich the father wavelet and mother wavelet

are

given by

$\phi(t)$ $=$ $\{\begin{array}{ll}1 (0\leq t\leq 1)0 (otherwise),\end{array}$ $\langle$4)

and

$\psi(t)$ $=$ $\{\begin{array}{ll}1 (0\leq t<1/2)-f (1/2\leq t<1) ,0 (otherwise)\end{array}$ $\langle 5\rangle$

then$\phi_{j,k}(t)$and$\psi_{j,k}(t)$canbegenerated byintroducingascaling parameter$j$andashift parameter$k$

as

$\phi_{J^{k}}(t\rangle$ $=$ $2^{j/2}\phi(2^{j}t-k)=\{\begin{array}{ll}2^{j/2} (2^{-j}k\leq t\preceq 2^{-j}(k+1))0 (otherwise),\end{array}$ (6)

and

$\psi_{j_{i}}k(t\rangle$ $=$ $2^{J/2}\psi(2^{j}t-k\rangle=\{\begin{array}{ll}2^{j/2} (2^{-j}k\leq t<2^{-j}(k+1/2)\rangle-2^{j/2} (2^{-j}(k+1/2)\leq t<2^{-j}(k+1))0 (otherwise).\end{array}$

Because$\phi_{j,k}(l)$and$\psi_{ノ,k}(t)$

are

orthonormalbases,their coeﬃcients

can

befoundby calculating theinnerproduct

ofthetargetfunction and themselves.Therefore,the scaling coeﬃcients andwaveletcoeﬃcients

are

defined

as

$\alpha_{jJ} = \int_{\infty}^{\infty}f(t)\phi_{ノ,k}^{*}(t)dt$, (7)

$\beta_{i^{k}}, = \int_{\infty}^{\infty}f(t)\psi_{j,k}^{*}(t)dt$, (8)

wherenotation $*$

means

complexconjugation.

In practice,thehighest resolution level$J$dependsonthe length ofobserved data, say$n(=2^{J})$

.

_{Additionally,}

(3)

givenby

$f_{J}(t \rangle = f_{0}(t)+\sum_{j=\mathfrak{o}}^{J-1}g_{j}(t)$, (9)

$f_{j}(t)= \sum_{\succ-0}^{2^{j}-1}\alpha_{jJ}\phi_{j,k}(t)$, (10)

$g_{j}(t)= \sum_{k=\mathfrak{o}}^{2^{J}-1}\beta_{j,k}\psi_{J^{4}}f(/)$. (11)

Inotherwords,thelevel-J multiresolution representationof$f(t)$

on

continuous time$t$isgivenby

$f_{J}(t) = \alpha 0.0\phi_{0,0}(t)+\sum_{j=0}^{J-1}\sum_{k4}^{2^{j}-1}\beta_{jJ}\psi_{J\grave{J}}(t\rangle$

.

(12) $\circ n$theother hand,in discrete time case,targetfunction$f_{f}$$(i=1,2, \cdots, n)$

can

be expandedin

a

similar fasion:

$f^{J},, = u0, \mathfrak{o}\phi_{0.0}(i)+\sum^{J-1}\sum_{-j=0\succ \mathfrak{o}}^{2^{J}-1}v_{j,k}\psi_{j,k}(l)$, (13)

where $u_{j,k}$ and$v_{jJ}$

are

discretescaling

coeﬃcients

and discrete wavelet coeﬃcients, respectively. Because the

approximation

error

between continuous and discrete coeﬃcients

can

be adjusted byfactor $\sqrt{n}$([1]),

we

have $u_{j}J = \frac{1}{\sqrt{n}}\sum_{j=1}^{n}f_{f}\phi_{jf}^{*}(i)$, (14)

$v_{jJ}$ $=$ $\frac{1}{\sqrt{n}}\sum_{1=1}^{n}f\psi_{jj}^{*}\langle\iota)$

.

$(1S)$

The mappingfromfunction$f_{i}$tocoeﬃcients $(u_{j\prime}, v_{j_{t}k})$ iscalledthediscrete Haarwavelet

transform

(DHWT),

whilethereconstruction offunction from coeﬃcients$(u_{\dot{j}}J, v_{Jt})$ is called the inverse discrete Haar wavelet

transform

(IDHWT).

FrominherencepropertyofHaarwavelet,$u_{\dot{j}}.k$and$v_{y.k}$

can

becalculated easily

as

follows.

For$j=J(J=\log_{2}n)$, $k=0$, 1,$\cdots,$ $n-1,$

$u_{j,k} = \frac{1}{\sqrt{n}}f_{k+1}$

.

(16)

For$j=J-1,$ $J-2,$$\cdots,$ $0,$ $k=0$, 1,$\cdots,$ $2^{j}-1,$

$u_{r\acute{J}} = \frac{1}{\sqrt{2}}(u_{j+1.2k}+u_{j+1,2k+1})$, (17)

$v_{j.k} = \frac{1}{\sqrt{2}}(u_{j+1,2k}-u_{j+1,2k+1})$

.

(18)

Furthermore,for$j=0$, 1,$\cdots,$ $J-1,$ $k=0$, 1,$\cdots,$ $2^{j}-1$,the IDHWTisachieved by

$u_{J+Ip_{k}} = \frac{1}{\sqrt{2}}(u_{j,2k}+v_{j.2k+1})$, (19)

$u_{j+1,2k+1} = \frac{1}{\sqrt{2}}(u_{J2k}-v_{jlk+}$ (20)

Let $s_{i}$ be the observation of$f(i=1,2, \ldots.n)$

.

Then, the empiricaldiscretescaling

coeﬃcients

$c_{j\lambda}$ and empirical$discre/e$_wavelet

_coeﬃcients

$d_{J\lambda}$

can

becalculated usingequations(14)and(15)with replacedby$s_{i}.$

(4)

Coeﬃcients$c_{j,k}$and$d_{j\cdot k}$

are

calledtheempirical

ones

because it isconsideredthatobservation

errors

are

included inside. Thatis,thenoises ofthe data

are

$considel\cdot ed$to be involvedinempiricaldiscretewavelet coeﬃcients$d_{j,k},$

and the threshold methods

can

beusedfordenoising. Up tonow,although

a

broadclassofthresholding schemes

are

available in the literature,the

common

choices include hard thresholding and

soft

thresholding, wherehard

thresholdingis

a

keep’

or

kill’rule,whilesoftthresholdingis

a

$shri\alpha$

or

kill’ rule. Thehardthresholdingis

defined

as

$\delta_{\tau}(d\rangle = d1_{|d|>\tau},$ (21)

and soft thresholdingis defined

as

$\delta_{\tau}(d) = s\mathfrak{R}(d)(|d|-\tau)_{+}$, (22)

forafixedthreshold level $r(>0)$,where $1_{A}$istheindicatorfunction of

an

event$A,$$sg\mathfrak{n}(d)$is the signfunctionof $d$_,and_{$(d)_{+}= \max(O, d)$}

.

Letting$d_{j,k}’$denotethe denoised empiricaldiscrete waveletcoeﬃcients,theestimatesof

targetfunction canbeobtainedbyapplyingIDHWT to$c_{j,k}$and$d_{j,k}’.$

3 Wavelet Shrinkage

Estimation

for NHPP-based

SRMs

Supposethat the number of software faults detectedthrough asystem test isobserved atdiscretetime $i=$

$O$, 1, 2, $\cdots$ Let

$Y_{i}$,and$N_{i}= \sum_{k=0}^{i}Y_{k}$respectivelydenote the number ofsoftwarefaultsdetectedat testingdate$i,$

andits cumulativevalue,where$Y_{0}=N_{0}=0$ is assumed withoutanyloss ofgenerality. Thestochastic process

$\{N_{i}:i=0, 1, 2, \cdots\}$issaidtobe

a

discretenon-homogeneousPoissonprocess($D$-NHPP)ifthe probability

mass

functionattime$i$isgiven by

$Pr\langle N_{l}=m\} = \frac{\{\Lambda_{f}^{m}}{m}!\exp\{-A_{i}\}, m=0, 1, 2, \cdots$_, ₍₂₃₎

where$\Lambda_{i}=E[N_{i}J$ is theexpectedcumulativenumber of software faults detected by testing date$i$

.

ThefUmction $\lambda_{j}=\Lambda_{i}-\Lambda_{i-1}(i\geq 1\rangle$iscalled the discrete intensityfunction,andi:npliestheexpectednumberoffaults$de\{$ecied

attesting date$i$,say$\lambda_{i}=E[Y_{i}]$

.

Inetherwords,

$Y, = \lambda_{i}+\eta_{7}, i=0, 1, 2\ldots$

.

(24)

Thewavelet-basedtechniques havebeenwell established especially in several

areas

such

as

non-parametric

regression,probability densityestimation,time-seriesanalysis,etc.Considel$\cdot$

thefollowingnon-parametric

regres-sionmodel:

$S, = z_{i}+\epsilon_{i}, i=1, 2, \cdots, n$, (25)

where$z_{\dot{f}}$

are

arbitrary targetfunctionsofdiscrete-time index ; and$\epsilon$

,

are

independentGaussianrandomvariables

with$N(O, \sigma^{2})$and$\sigma(>0)$

.

One ofthe basicapproachestotheGaussian non-parametric regression istoconsider

the unknown function$z_{i}$ expanded

as a

waveletseries and to transformthe original problemto

an

estimation of

the wavelet coeﬃcients from the data. DonohoandJohnstone$\mathfrak{l}6$,7,8]and Donohoetal. [9]proposednon-linear

waveletestimatorsof$z_{i}$,and suggestedtheextractionofthe significant wavelet coeﬃcientsbythresholding,where

waveletcoeﬃcients

are

set to$O$iftheirabsolutevalueisbelowa certainthreshold level. Then,applicationofthe

inverse wavelet transform ofdenoised coeﬃcientsbythresholding leads to

an

estimatoroftheunderlying target

function$z_{l}.$

Thisstandard wavelet-basedtechnique

can

beappliedtoanonparametricestimationofthediscretizedNHPP.

(5)

Table1: RepresentativeDataTransformaions(the

case

ofvarianceequalsto1/4).

Table 2: RepresentativeDataTransformaions(the

case

ofvarianceequalsto 1).

([3,4,2,10 Table 1 and Table 2showrepresentivevariance-stabilizingtransforms with diﬀerentvariances. By

using eitherof them, the

sottware

faultcountdata, say $y_{l}$,which isthe observation of$Y_{i}(i=1,2, \ldots, n)$, is

approximately transformed to theGaussiandata. That is, thetransformed realizations$s_{j}(i=1,2,$$\ldots,$ $n\rangle$by data

transformations

can

be regarded

as

the

ones

from the normallydistributed random variables:

$S_{f} = \lambda_{i}’+v_{i}, i=1, 2, \cdots, n$_, ₍₂₆₎

where$\lambda_{i}’$ isthe transformedsoftware intensity nation, and

$v_{t}$istheGaussianwhitenoisewith constantvariance. Then,$\lambda_{j}’$isthetargetfunctiontobe expandedbymultiresolution representation introduced in Section 2.

Forthresholding, too largethresholdmay cutoﬀimportant parts ofthe original function, whereas toosmall

thresholdretainsnoise in the selectivereconstruction.Hence,the choice ofthresholdis also

a

significant problem.

Infact manymethodstodeterminethethresholdlevelhave beenproposed(e.g. see[5]). In WSE([15, 16, 17

We

use

theuniversal threshold[6],and the’leave out-half

cross

validation threshold[13]:

$\tau = \frac{v}{\sqrt{n}}\sqrt{2\log n}$, (27)

$\tau = (1-\frac{\log 2}{\log n})^{-1/2}\tau(\frac{n}{2})$, (28)

where$v$is the standardvariationof empirical waveletcoeﬃcients,and$n$isthe lengthofthe observation$y_{1}$

.

Refer to[13]for details of$\tau(n/2)$.

4 Real

Data

Analysis

Although eight kinds of variance-stabiliz\’ingtransformations

were

usedinWSE,the most appropriateonefor

NHPP-based SRMs

was

notbe identified. Inthisexperiment,welook intothis problem. We

use

six setsof

soft-ware

fault countdata(groupdata)collectedfrom real projectdata sets([11 Let$(K,$ $n,$ $x_{n}\rangle$denote the triple of the softwaresize,thefinaltestingdateand the total number ofdetected fault. Thenthese datasets$(DS1 \sim DS6)$

(6)

arepresented by$(14kloc, 62weeks, 133)$,$(15kloc, 41weeks, 351)$,$(103kloc, 73weeks, 367)$,$(null, 46days, 266)$, $(300kb, 81days, 463\rangle, (2$ookloc, $11 ldays, 481)$, respectively. The MSE(meansquareserror)and LL$(\log$

likeli-hood)

_are

employed

as

the goodness-of-fit measures, where

MSE$i$ $=$

$\frac{\sqrt{\sum_{i=(}^{n}(\Lambda_{i}-x;\rangle^{2}}}{n}$

, (29)

$MSE_{2} = \frac{\sqrt{\sum_{j=1}^{n}(\lambda_{i}-y_{j})^{2}}}{n}$_, ₍₃₀₎

$LL$ $=$ $\sum_{:=1}^{n}(x_{i}-x_{i-i})\ln[\Lambda_{;}-\Lambda\vdash 1]-\Lambda_{n}-\sum_{j=1}^{/\iota}\ln[(x_{i}-x_{i-1})!].$ $(31\rangle$

We apply the WSE with two threshelding techniques (hard thresholding(h) v.s. softthresholding (s)), and

twothresholds(universalthreshold(ut)v.s. cross-validation threshold(cvt$\rangle$).Table 3presentsthegoodness-of-fit

results ofDS1,based

on

the fourthreshold methodsfor DS1 withdiﬀerent variance-stabilizingtransformations.

Due topagelimit,theresults shownhere

are

ofthe

case

where the highest resolution level$i$issettobe 3.

Forall the variance-stabilizingtransformations,it is observedthat whenuniversalthreshold isused, the

(7)

theaccuracyof the

case

of$l=1/4$is the

same

with

or

better than that of the

case

of$\nu^{2}=1$ _in_most

cases.

_The

reasons are

consideredtobe

as

follows. Forthe

case

ofuniversalthreshold, because the magnitude relationships

betweentransformed dataand theuniversal threshold

are

the same, the

same

empirical discrete wavelet

coeﬃ-cients$d_{JJ}$

are

set tobe O. Therefore, the accuracies

are

the

same

fromboththeoretical and experimental points

ofview. That is, thevarianceof the variance-stabilizingtransforms doesnot aﬀectthe accuracy of WSEwhen

universalthresholdispreferred. $\circ n$theotherhand,the

same

magnitude relationshipdoes notexistin the

case

of

cross-vaiidation threshold. Hence,theaccuracies

are

diﬀerent whendiﬀerent variance-stabilizingtransformations

are

used from theoretical point ofview.However,itisobserved in all the sixdata sets ofour experimentsthat,the

accuracyofthe

case

of $l=1/4$isthe

same

with

or

betterthan that ofthe

case

of$f=1$inmost

cases.

Therefore,

a

variance-stabilizing transformationwith smallervariance ispreferred.

5 Conclusion

Inthispaper,the eﬀectivenessof variance-stabilizingtransformations to waveletshrinkageestimation

was

in-vestigated. Fromthe numerical evaluation,wefound thattheaccuracyofthe

case

of $l=1/4$

was

the

same

with

orbetter thanthat of the

case

of$l=1$ inmost

cases.

Therefore, weconcluded that suchvariance-stabilizing

transformations thattransform originalvarianceto 1/4should be usedinwaveletshrinkage estimationto

assess

softwarereliability.

Acknowledgment

Thiswork

was

supported by JSPS KAKENHI GrantNumber26730039.

References

[1] F. Abramovich, T. C. Bailey, and T. Sapatinas, “Wavelet analysis and its statistical applications The

Statistician,49(1),pp. 1-29(2000).

[2] F.J.Anscombe, “The transformationofPoisson,binomial andnegative binomialdata,“Biometrika, vol. 35,

no.

3-4,pp.246-254, 1948.

[3] M.S.Bartlett, “Thesquareroottransformation in the analysisofvariance,”SupplementtotheJournal

_of

he

Royal StatisticalSociety,vol. 3,no. 1,pp.68-78,1936.

[4] M. S. Bartlett, “The

use

oftransformation Biometrics,vol.3,

no.

1,pp.39-52, 1947.

[5] P.Besbeas,I.DeFeis and T. Sapatinas, “A comparative simulation studyofwaveletshrinkageestimators for

Poisson counts International StatisticalReview,vol.72,$\iota lo.$2,pp.209-237,2004.

[6] D. L.DonohoandI. M. Johnstone, “Idealspatialadaptationby waveletshrinkage Biometrika, vol. 81,

no.

3,pp.425-455, 1994.

[7] D. L. Donohoand 1.M. Johnstone, $Adaptin_{g}$tounknown smoothnessvia wavelet_shrinkage _Journal

_of

the American StatisticalAssociation,vol.90,pp. $120\triangleright 1224$_, 1995.

[8] D.L.Donoho andI. M.Johnstone, “Minimaxestimation viawaveletshrinkage Annals

_of

Statistics,vol.

26,

no.

3,pp.879-921, 1998.

[9] D. L. Donoho, I. M.Johnstone, G. Kerkyacharianand D.Ricard, “Waveletshrinkage: asymptopia? (with

(8)

[10] M.$i^{i}$

.

Freemanand J. W. Tukey,

“Transformations relatedto theangular andthesquareroot,” The Annals

_of

MathematicalStatistics,vol.21,no.4,pp. 607$*$11, 1950.

[11] M. R. Lyu(ed.), Handbook

_ofSofware

ReliabilityEngineering, McGraw-Hill, New York,1996.

[12] J. D. Musa, A. lannino and K. Okumoto,

_Software

Reliability, Measurement, Prediction, Application,

McGraw-Hill, NewYork, 1987.

[13] G.P.Nason, “Waveletshrinkageusingcross-validation,” Journal

_of

theRoyalStatisticalSociety: SeriesB,

vol. 58,pp.463-479, 1996.

[14] H.Pham,

_Software

Reliability, Springer,Singapore,2000.

[15] X. Xiao andT. Dohi, “Wavelet-based approach forestimating software reliability Proceedings

of

20th

International Symposiumon

_SofAvare

Reliability Engineering(ISSRE’09),pp. 11-20, IEEE CSPress,2009.

[163 X. Xiao andT.Dohi, “A comparativestudy ofdatatransformations for wavelet shrinkageestimation with

application to software reliability assessment Advances in

_Software

Engineering, vol. 2012, ArticleID 524636,9

pages,

May2012.

[17] X.Xiao and T.Dohi, “Waveletshrinkageestimationfor non-homogenousPoisson processbased software